Journal of Experimental Psychology


Arthur W. Melton, Editor
Department of Psychology, University of Michigan
Ann Arbor, Michigan

David A. Grant, Associate Editor, University of Wisconsin
Delos D. Wickens, Associate Editor, Ohio State University

Consulting Editors
E. James Archer, University of Wisconsin
Judson S. Brown, State University of Iowa
Cletus J. Burke, Indiana University
Paul M. Fitts, Ohio State University
Frederick C. Frick, Massachusetts Institute of Technology
Frank A. Geldard, University of Virginia
James J. Gibson, Cornell University
Clarence H. Graham, Columbia University
Harold W. Hake, University of Illinois
Lloyd G. Humphreys, University of Illinois
Arthur L. Irion, Tulane University
Howard H. Kendler, New York University
Donald B. Lindsley, University of California, Los Angeles
Kenneth MacCorquodale, University of Minnesota
Quinn McNemar, Stanford University
L. Starling Reid, University of Virginia
Kenneth W. Spence, State University of Iowa





CONTENTS 


On the Use of Inconsistency of Preferences in Psychological Measurement: C. H. Coombs
Methodological Aspects of Auditory Threshold Measurements: J. F. Corso and A. Cohen
An Analysis of Positioning Movements and Static Reactions: E. A. Fleishman
Age Differences in Transfer and Retroaction as a Function of Intertask Response Similarity: M. Gladis and H. W. Braun
A Goalless Gradient: A. C. Pereboom
Two-Category Judgments of Sequences of Stimuli of Two Values: C. DeSoto
Factors in Individual Improvement in Solving Twenty-Questions Problems: W. L. Faust
Effect of Brightness of Simultaneous Visual Stimulation on Absolute Auditory Sensitivity: R. F. Thompson, J. F. Voss, and W. J. Brogden
Differential Conditioning and Intensity of the UCS: W. N. Runquist, K. W. Spence, and D. W. Stubbs


Motus Verbal Similarity as a Determinant of the Generalization of a Conditioned 


Transfer After Training With Single Versus Multiple Tasks: C. P. Duncan


a os o Waeus Se Signal Preceding a Noxious Stimulus on Verbal Rate and Heart 


Utility of Grades: Level of Aspiration in a Decision Theory Context: S. W. Becker and S. Siegel
The Empirical Validity of Equal Discriminability Scaling: E. A. Alluisi and R. C. Sidorsky
Supplementary Report: Interlist Interference and the Retention of Paired Consonant Syllables: B. J. Underwood and J. Richardson





American Psychological Association
Vol. 55 No. 1 January 1958 











Arthur C. Hoffman, Managing Editor
Helen Orr, Circulation Manager
Sadie J. Doyle, Editorial Assistant


The JOURNAL OF EXPERIMENTAL PSYCHOLOGY is published monthly,
two volumes per year, by the American Psychological Association, Inc., 
at Prince and Lemon Sts., Lancaster, Pa. The subscription rate per volume 
is $8.00, or $16.00 annually. Single copies are $1.50. Subscriptions, orders, 
address changes, and business communications should be addressed to the 
American Psychological Association, Inc., 1333 Sixteenth St. N.W., Wash- 
ington 6, D. C. 


This JOURNAL publishes original experimental investigations which are 
intended to contribute toward the development of psychology as an experi- 
mental science. Studies with normal human subjects are favored over 
studies involving abnormal or animal subjects, except when the latter are 
specifically oriented toward the extension of behavior theory. Experimental 
psychometric studies and studies in applied experimental psychology or 
engineering psychology may be accepted if they have broad implications 
for experimental and theoretical psychology. 


Normally, articles exceeding 20 printed pages cannot be accepted. Within 
this limit, the piecemeal experiment-by-experiment reporting of psychological re- 
search is discouraged, and the reporting of the data and conclusions of substantial 
segments of research efforts is encouraged. Specifically, an integrated series of 
studies accomplished simultaneously (e.g., most doctoral dissertations) must be 
presented in a single article. Special provision is made for the publication of 
Supplementary Reports and Replication Reports in articles of not more than 1.5 
printed pages (see Editor's Note, this JOURNAL, 1957, 53, 1-2).


Address all articles submitted for publication to the Editor: Arthur W. Melton, 
Department of Psychology, University of Michigan, Ann Arbor, Michigan. Manu- 
scripts must adhere to the conventions concerning reference citations, preparation 
of tables and figures, manuscript format, etc., as described in the Publication 
Manual of the American Psychological Association. When in doubt about prac- 
tices in this Journal, authors should examine a recent issue. 


Articles are published in the order of their receipt, except in rare circumstances. 
Authors are supplied with 50 free offprints without covers. All of the cost of an 
author's alterations in galley proof is charged to the author. Priority in publi- 
cation is given to articles whose authors assume the full cost of publication. In 
1958 the publication cost is $20 per printed page. Authors of priority publications 
receive no free offprints. 


To save printing costs, supplementary material to articles in this Journal is
deposited with the American Documentation Institute. Copies may be obtained 
by ordering the Document Number given in the footnote reference to ADI. Or- 
ders should be sent to ADI Auxiliary Publications Project, Photoduplication 
Service, Library of Congress, Washington 25, D. C. Orders must be accompanied
by checks or money orders payable to the Chief, Photoduplication Service, Library
of Congress.







Send all address changes to 1333 Sixteenth St. N.W., Washington 6, D. C. Address changes must
reach the Subscription Office by the 10th of the month to take effect the following month. Undelivered
copies resulting from address changes will not be replaced; subscribers should notify the post office that they
will guarantee second-class forwarding postage. Other claims for undelivered copies must be made within
four months of publication.


© 1958 by the American Psychological Association, Inc. 








ON THE USE OF INCONSISTENCY OF PREFERENCES IN PSYCHOLOGICAL MEASUREMENT 1

CLYDE H. COOMBS

University of Michigan

Inconsistency of judgments has long been one of the behavioral manifestations serving as a foundation stone for psychological measurement. Fechner (1) and Thurstone (4) built theories and systems of psychological measurement based on measures of inconsistency. A fundamental assumption common to both workers is that the degree of inconsistency is monotonically related to psychological distance. Fechner further assumed that "equally often noticed differences are equal" and Case V of Thurstone's Law of Comparative Judgment makes the same assumption. Thurstone further assumes in his φ(γ) hypothesis that the function relating inconsistency to psychological distance is the integral of a normal curve. These systems have both been used extensively for the purpose of scaling stimuli on some attribute which they have in common.

Stevens (3) has concluded that measures of inconsistency (e.g., just noticeable differences) do not constitute equal units of psychological magnitude on the kinds of perceptual continua he calls Class I or prothetic. This class includes brightness, loudness, heaviness, length, duration, etc. On these continua the psychological magnitude, as determined by such methods as fractionation and direct magnitude estimation, turns out to be a power function of the stimulus magnitude, whereas the JND scale approximates a logarithmic function.

However the case may be in such psychophysical problems, the recent development of a theory about preferential choice (2) raises the question of the relation of inconsistency to psychological distance for such data. The theory leads to the inference that inconsistency of preferential judgments is not monotonically related to psychological distance and in particular is a function of two variables (as will be shown), one of which is psychological distance, and that the relation is monotonic only if the second variable (here called laterality) is held constant.

Fig. 1. Discriminal distributions of stimuli (A, B, ..., G) and a distribution of the individual's ideal (I).

1 Carried out ... Research ... Amsterdam ... Director ... generously made ... available, and his students ... before and after the experiment. The author is particularly indebted to Mr. v.d. ...roecke, who served as research assistant.


Consider the case of a unidimensional latent attribute generating preferential choices. The stimuli are conceived as having a distribution of discriminal processes 2 on this attribute; and the individual whose preferences are being obtained is also conceived as having a distribution of ideals (Fig. 1), an ideal being a point from which the individual evaluates the stimuli and states as his preference the stimulus which is nearer his ideal. Thus, for any judgment of preference, we consider that the individual has an ideal point, each stimulus is represented by a point drawn from its distribution, and the judgment reflects which stimulus point is nearer the ideal point.3

Let the term unilateral pair signify a pair of stimuli whose discriminal distributions are both on the same side of the scale relative to the distribution of ideals, and let the term bilateral pair signify a pair of stimuli whose discriminal distributions are on opposite sides of the distribution of ideals. There may be stimuli whose discriminal distributions overlap the distribution of ideals but these will be neglected in the following treatment.

2 Discriminal processes is Thurstone's term for what might synonymously be called perceived magnitudes.

3 The concept of an ideal introduced here is not to be confused with the terms anchoring, frames of reference, or adaptation level as used in categorical rating studies. The ideal represents a hypothetical stimulus which S would prefer to all other stimuli and which are themselves preferred in order of their decreasing distance from the ideal.





The inconsistencies of an individual's preferences between unilateral pairs as compared with bilateral pairs will be of a different order of magnitude according to this model. In the case of unilateral pairs, only the overlap of the discriminal distributions of the stimuli will generate inconsistency, whereas the inconsistency of judgments between bilateral pairs will be generated by the variance of the individual's distribution of ideals as well as by the discriminal dispersions.

This may be visualized more clearly if the individual is seen as folding this scale at his ideal point, and, as this folding point varies between successive judgments, the discriminal distributions of unilateral pairs will move nearer to or farther from him in unison, whereas those for bilateral pairs will move in opposite directions.

The implication of this model is that the transformations of inconsistency measures into psychological distance measures must be different, depending upon whether the inconsistency is between unilateral pairs or bilateral pairs. The experiment reported here was designed to test this prediction.
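
The prediction can be made concrete with a small Monte Carlo sketch of the model. It is only an illustration under assumed values: the scale positions, the two standard deviations, and the function name `preference_rate` are hypothetical and do not come from the experiment. On each simulated judgment an ideal point and one point per stimulus are drawn from normal distributions, and the stimulus nearer the ideal is counted as preferred.

```python
import random


def preference_rate(mu_a, mu_b, mu_ideal, sd_stim, sd_ideal, trials=20000, seed=0):
    """Share of simulated judgments in which stimulus a is nearer the ideal than b."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        ideal = rng.gauss(mu_ideal, sd_ideal)  # ideal point for this judgment
        a = rng.gauss(mu_a, sd_stim)           # discriminal process for stimulus a
        b = rng.gauss(mu_b, sd_stim)           # discriminal process for stimulus b
        if abs(a - ideal) < abs(b - ideal):
            wins += 1
    return wins / trials


# Hypothetical values: the mean ideal is at 0. In both pairs the preferred stimulus
# lies 2 units from the mean ideal and the other 3 units, so the difference in
# distance from the ideal is the same; only laterality differs.
unilateral = preference_rate(2.0, 3.0, 0.0, sd_stim=0.3, sd_ideal=0.8)
bilateral = preference_rate(2.0, -3.0, 0.0, sd_stim=0.3, sd_ideal=0.8)
print(f"unilateral pair: {unilateral:.2f}  bilateral pair: {bilateral:.2f}")
```

Under these assumed values the unilateral pair is judged almost perfectly consistently, while the bilateral pair is not, because the variance of the ideal enters only the bilateral comparison.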


METHOD

Subjects.—The Ss were two male and two female psychology students at the University of Amsterdam; all were naive with respect to the experimental problem.

Stimuli and apparatus.—The stimuli were 12 grey chips prepared by a commercial photographer by exposing photographic paper for different periods of time. They varied in arbitrary steps from almost white to almost black.

An apparatus was constructed to present the stimuli in sets of four at a time in a rectangular arrangement of circular chips with 3-in. diameters and 9 in. between centers (not diagonally). The chips were exposed through a 16-in. square mat, painted white, in which the four 3-in. circular holes were cut. The rest of the apparatus was painted black. There was a sliding cover which was closed between presentations. Inside the apparatus were four large aluminum discs on which the chips were pasted. The discs could be rotated from the








... combination of four stimuli ...

Sequence.—There are 495 combinations of 12 stimuli taken four at a time. ...

... presentation ... was GFEDHICBJAKL ... rank ... record ... provided.

Treatment of data.— ... for each pair the relative frequency ... one member of a pair is preferred ... replications ... be interpreted ... comparisons because ... e.g., HJ, ICB, etc. ... the two stimuli ... a time, which imposed ... embedded six paired comparisons.

Information theory ... computing the amount of information in these rank order replications: in any given set of four stimuli the number of ... is 4! = 24 ... is log2 24 = 4.58 bits per presentation. These are distributed over six paired comparisons so ...

... e.g., DHC, ... of preferences ... should be at least as great
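
The counts quoted in this passage can be checked with a few lines of arithmetic. The sketch below is illustrative only; the variable names are ours, and the remark that every set was presented twice is an inference from the 90 replications per pair mentioned in the Results, not a statement recovered from the Method.

```python
import math

n_stimuli, set_size = 12, 4

n_sets = math.comb(n_stimuli, set_size)                  # sets of four from 12 stimuli
n_orders = math.factorial(set_size)                      # rank orders of one set of four
bits_per_presentation = math.log2(n_orders)              # information in one rank order
pairs_per_set = math.comb(set_size, 2)                   # paired comparisons in one set
n_pairs = math.comb(n_stimuli, 2)                        # distinct pairs of stimuli
sets_per_pair = math.comb(n_stimuli - 2, set_size - 2)   # sets containing a given pair

print(n_sets, n_orders, round(bits_per_presentation, 2), pairs_per_set, n_pairs, sets_per_pair)
```

Each pair occurs in 45 of the 495 sets, so two presentations of every set (our assumption) would give the 90 replications per pair reported later.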










TABLE 1 


INCONSISTENCY OF PREFERENCES FOR EACH OF THE FOUR SUBJECTS


1 2 3 4 5 | 6 7 s 9 10 11 12 
! 
1 §2 Sl 67 59 72 61 6468 |} 76 84 O91 &8 89 94 99 90°) 100 94 100 100) 100 100 
70 53 | 91 63 87 66 | 74 86 | 8079 94 97) 9392 100 94/100 97, 100 99 100 99 
2 83 52 83 77 7496) 8097 | 98 96) 99 8? 100100 100 © 100 100 100 100 
68 62 7862 | 5287 7387 94 90 9996 97 99/100 93 100 99° 100 100 
3 50 86 | 67 93) 64.99 | 92100 97 87) 97 100 100 100) 100 100) 100 100 
53 54 5163 6068 | 89 94 87 80/100 89) 93 97 9 9 100 97 
4 61 93 | 6398 90 99 100 80 91 100 100 100 100 98 100 100 
53 64 57 61 | 93 98 83 80 100 8&2 97 100) 100 100 100) 99 
5 68 96 SO 99 7267 100106 99 99 100100 100 100 
63 Si 76 OO 97 93 99 «99 99 8&7 100 96 100 99 
6 54 59 6650 97 8&8) 99 99/100 50 100 100 
80 71 9294 8&8 99 100 84 100 96 100 99 
7 9) 56 94 #943 100 99 100 96 100 100 
54 57 90 63 82 94 89 100 100 91 
8 8% 53 100 G67 94 99 100 99 
64 93 94 68 99 93) 100 100 
9 A 9) 99 94 100 100 
52 5 76 «84 «100 100 
10 62 97 97 100 
97 100 100 73 
11 100) 80 
99 $3 
Note.—The upper left hand corner of each cell is S #1; the upper right, S #2; the lower left, S #3; and the lower
right, S #4. The labels of the rows and columns from 1 to 12 are as follows for each of the 4 Ss:
S #1: G F E D H I C B J A K L
S #2: J I H G F E D K C B L A
S #3: F G E D H I C J B K L A
S #4: G F H I D E J C B K L A
of the first over the second, or the second over the third, within the reliability with which the percentages were determined. The theory proposed here asserts that this will not always be the case; in particular it will tend not to hold for bilateral adjacent triples; in general it should hold for unilateral triples; and it will hold for bilateral split triples especially.

Consider the following illustration of a bilateral adjacent triple from the former example, ICB. This is the rank order of preference, hence, the distance from I to B should be at least as great as either I to C or C to B. If we were to find that the percentage of preferences of I over B (a bilateral pair) was 66% and the percentage of preferences of C over B (a unilateral pair) was 92%, such data would violate the hypothesis of a monotone transformation of inconsistency into psychological distance. Such results are anticipated, however, where I and B are a bilateral pair and where C and B are a unilateral pair, because the measure of the inconsistency on a bilateral pair is a function of the variance of the distribution of ideals in addition to the discriminal dispersions of the stimuli; whereas the measurement of inconsistency on a unilateral pair is a function only of the discriminal dispersions of the stimuli.

RESULTS


Rank orders of preferences.—Table 1 contains the inconsistency data of the four Ss. Each entry is the percentage of times that S preferred the stimulus corresponding to that row over the stimulus corresponding to that column. This table is a merger of the separate tables of the four Ss. It will be noted that every entry above the diagonal is at least 50%. The stimuli are labelled from A to L in order of decreasing brightness. The fact that a permutation of each table exists such that all the entries on one side of the diagonal are at least as great as 50% implies that the 66 paired comparisons are transitive for each S, and the order of the columns corresponds to the rank order of that S's dominant preferences, his I scale.

Inspection of these rank order I scales reveals that three of them can be obtained by folding the stimulus scale of greys, satisfying a condition for unidimensionality of the latent attribute. The one exception is that of the fourth S, for whom stimuli D and E are reversed. It may be seen from Table 1 that this S preferred D to E 51% of the time which represents a split of 46 to 44 out of the 90 replications. Looking at the data from all the Ss on this pair of stimuli, it is evident that they were very close together and this reversal is not significant.
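
The folding condition can be checked mechanically. The sketch below is a minimal illustration and not the author's procedure: it treats an I scale as obtainable by folding when the stimuli ranked so far always form an unbroken stretch of the stimulus scale A to L. The I scales are the row and column labels as we read them from the note to Table 1, and the function name is hypothetical.

```python
def is_folded_scale(i_scale, stimulus_scale):
    """True if every initial segment of the I scale is a contiguous stretch of the stimulus scale."""
    position = {s: k for k, s in enumerate(stimulus_scale)}
    lo = hi = position[i_scale[0]]
    for s in i_scale[1:]:
        p = position[s]
        if p == lo - 1:      # next stimulus extends the stretch on one side
            lo = p
        elif p == hi + 1:    # or on the other side
            hi = p
        else:                # a stimulus was skipped, so no single fold gives this order
            return False
    return True


STIMULUS_SCALE = "ABCDEFGHIJKL"  # decreasing brightness
I_SCALES = {
    1: "GFEDHICBJAKL",
    2: "JIHGFEDKCBLA",
    3: "FGEDHICJBKLA",
    4: "GFHIDEJCBKLA",
}
folded = {s: is_folded_scale(order, STIMULUS_SCALE) for s, order in I_SCALES.items()}
print(folded)
```

This is an ordinal check only; it flags the fourth S because D is ranked before E, which is the reversal discussed above.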
Test of monotonicity.—Given the I scale of each S, a breakdown into unilateral triples and bilateral adjacent triples was made, and they were examined to see how many of each kind satisfied the necessary condition for the existence of a monotonic transformation of inconsistency into psychological distance.
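
A sketch of how triples might be sorted and tested follows. It reflects our reading of the three kinds of triples (a split triple has its middle stimulus on the opposite side of the ideal from both end stimuli) and of the necessary condition (the percentage for the end pair should not fall below that for either adjacent pair). The function names and the side assignment used for S #2 are assumptions made for illustration.

```python
from itertools import combinations


def classify_triple(triple, side):
    """Classify a triple listed in I-scale (preference) order.

    side maps each stimulus to -1 or +1 according to the side of the ideal it lies on.
    """
    first, second, third = (side[s] for s in triple)
    if first == second == third:
        return "unilateral"
    if first == third:  # the middle stimulus lies across the ideal from both ends
        return "bilateral split"
    return "bilateral adjacent"


def violates_monotonicity(p_first_third, p_first_second, p_second_third):
    """The end pair spans the greatest distance, so its percentage should be the largest."""
    return p_first_third < max(p_first_second, p_second_third)


def count_triples(i_scale, side):
    counts = {"unilateral": 0, "bilateral split": 0, "bilateral adjacent": 0}
    for triple in combinations(i_scale, 3):  # combinations preserve I-scale order
        counts[classify_triple(triple, side)] += 1
    return counts


# I scale of S #2 without its first two stimuli; K and L are taken to lie on the
# dark side of the ideal and the remaining stimuli on the light side (our reading).
scale_s2 = "HGFEDKCBLA"
side_s2 = {s: (1 if s in "KL" else -1) for s in scale_s2}
counts_s2 = count_triples(scale_s2, side_s2)
print(counts_s2)

# The ICB illustration: I over B 66%, C over B 92%; the I over C figure is hypothetical.
icb_violates = violates_monotonicity(66, 80, 92)
```

For S #2 the counts of split and adjacent triples obtained this way agree with the totals that are legible in Table 3.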


If the distribution of the individual's ideals overlaps the discriminal distributions of stimuli, the definition of unilateral and bilateral pairs involving such stimuli breaks down. Consequently, the first two stimuli on each I scale were dropped in the following analysis and the results based on the remaining 10 stimuli. The 45 paired comparisons that remain for each S were divided into those which were between bilateral pairs and those between unilateral pairs. The bilateral pairs were then examined to see how many of them violated monotonicity in a bilateral adjacent triple, and the unilateral pairs were examined to see how many of them violated monotonicity in a unilateral triple. The results for each S are presented as four-fold tables in Table 2 with χ² tests of significance. Obviously, the combined results are highly significant.
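
As a rough check on one of the four-fold tables, the sketch below computes a chi-square with Yates' continuity correction for the counts we read for S #1 in Table 2 (bilateral: 15 satisfied, 10 not; unilateral: 19 satisfied, 1 not). Whether the correction was applied in the original analysis is an assumption on our part, and the function names are hypothetical.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the four-fold table [[a, b], [c, d]]."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square variate with one degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


chi2_s1 = yates_chi_square(15, 10, 19, 1)
p_s1 = p_value_df1(chi2_s1)
print(f"chi-square = {chi2_s1:.2f}, p = {p_s1:.3f}")
```

With the correction, the probability for S #1 comes out close to the .02 level shown in the table.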

TABLE 2

RELATION OF LATERALITY TO MONOTONICITY

                                Subject 1      Subject 2      Subject 3      Subject 4
                                Sat.   Not     Sat.   Not     Sat.   Not     Sat.   Not
Bilateral adjacent triples       15    10        9     7       12    13        6    19
Unilateral triples               19     1       27     2       18     2       14     6
p                                 .02            .01           <.01           <.01

Sat. = monotonicity satisfied; Not = monotonicity not satisfied.

A given paired comparison enters into many triples and hence may satisfy or violate monotonicity in more than one. There is, of course, some dependency among such figures so no statistical test is available,
but it is instructive to see the figures anyway. In Table 3 is presented a count of the number of each of the three kinds of triples (bilaterals split, unilaterals, and bilaterals adjacent) there were for each S and the number of such triples which violated monotonicity. The theory predicts that the frequency of violations should increase from a rare event with bilaterals split to a common event with bilaterals adjacent.

TABLE 3

NUMBER OF EACH KIND OF TRIPLE AND NUMBER VIOLATING MONOTONICITY

              Bilaterals, Split       Unilaterals             Bilaterals, Adjacent
Subject       Total   No. Violating   Total   No. Violating   Total   No. Violating
No.           No.     Monotonicity    No.     Monotonicity    No.     Monotonicity
1              34          0            20          1           66         18
2              24          0            56          2           40         ...
3              38          0            20          2           62         ...
4              38          0            20         10           62         ...
Combined      134          0           116         15          230        119

DISCUSSION

These results, while they clearly indicate that the relation of inconsistency to psychological distance is dependent upon laterality, by no means imply that inconsistency cannot be used to measure psychological distance. Given that monotonicity is satisfied with laterality held constant, then there may exist a valid transformation of measures of inconsistency into measures of psychological distance. The different transformations for different laterality would still presumably belong to the same family of curves. For example, if a normal curve transformation were suitable such as is used with the Law of Comparative Judgment and the SD of the distribution of differences is used as the unit of measurement, then there is reason to expect that this SD has one value for unilateral pairs and a larger value for bilateral pairs. It would be like measuring psychological distance with a foot rule for unilateral pairs and with a yard stick for bilateral pairs. Some interesting theoretical and experimental problems arise here.
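
The foot rule and yard stick remark can be illustrated with a normal curve transformation that uses a different unit for each kind of pair. The sketch is purely illustrative: the two SD values are hypothetical, chosen only to show that a larger unit for bilateral pairs can restore the expected order of distances in the ICB example, and the function name is ours.

```python
from statistics import NormalDist


def distance_from_percentage(percent_preferred, sd_unit):
    """Normal-deviate transformation of a preference percentage, scaled by the unit sd_unit.

    The percentage must lie strictly between 0 and 100.
    """
    z = NormalDist().inv_cdf(percent_preferred / 100.0)
    return z * sd_unit


SD_UNILATERAL = 1.0  # hypothetical unit for unilateral pairs
SD_BILATERAL = 4.0   # hypothetical, larger unit for bilateral pairs

d_cb = distance_from_percentage(92, SD_UNILATERAL)  # C over B, a unilateral pair
d_ib = distance_from_percentage(66, SD_BILATERAL)   # I over B, a bilateral pair
print(f"C to B: {d_cb:.2f}   I to B: {d_ib:.2f}")
```

With a single unit the 66% figure would map to a shorter distance than the 92% figure; with the larger bilateral unit assumed here, the distance from I to B exceeds that from C to B, as the rank order requires.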


SUMMARY

... measurement of psychological ..., with the JND concept ... This experiment is an in... relation of measures of inconsistency to psychological distance in the context of preferential choice as distinct from the context of psychophysical discrimination. A theory was tested which ... inconsistency of preferences ... monotonically related to ... held constant. A series ... were presented in sets ... four Ss who indicated their preferences among the stimuli as best representing their concept of grey. Each S's concept of grey turned out to be an intermediate grey with lighter greys on one side and darker greys on the other. The hypothesis was tested that inconsistency between pairs of greys whose members were both on the same side of him would be of a different order of magnitude than inconsistency between pairs whose members came from opposite sides of him. Each S constituted a separate experiment and the hypothesis was sustained for each of them. The significance of this for converting inconsistency measures into psychological distance measures was discussed.











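REFERENCES

1. Boring, ...
...
Stevens, S. S. On the psychophysical law. Psychol. Rev., 1957, 64, 153-181.
Stevens, S. S., & Galanter, E. H. Ratio scales and category scales for a dozen perceptual continua. J. exp. Psychol., 1957, 54, 377-411.
Thurstone, L. L. A law of comparative judgment. Psychol. Rev., 1927, 34, 273-286.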



Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 





METHODOLOGICAL ASPECTS OF AUDITORY 
THRESHOLD MEASUREMENTS 1


JOHN F. CORSO AND ALEXANDER COHEN 


The Pennsylvania State University 


The history of psychophysics re- 
veals numerous attempts to establish 
the absolute intensive threshold for 
hearing. Recent research, however, 
has tended to augment these data 
with information on the variability 
of threshold measurements as a func- 
tion of time (e.g., 2, 6, 11). The 
critical relevance of shifts in sensi- 
tivity in the formulation of modern 
theories of hearing has already been 
emphasized (3). 

Threshold changes at a specific
frequency for a given S may arise
from several classes of variables:
(a) physical, (b) physiological, (c)
psychological, and (d) methodological.
Within each of these classes there 
are specific factors operating, the 
effects of which in most cases have not 
yet been fully assessed. 

The present study consisted of two 
experiments on methodology. Ex- 
periment I investigated the magnitude 
of inter-S and intra-S variability in 
measurements of the intensive audi- 
tory threshold; Exp. II was concerned 
with the effects of practice on thresh- 
old values. 


EXPERIMENT I


Method 


Subjects.—There were 38 Ss (18 male; 20 female) in the experiment. They ranged in age from 18 to 24 yr., and all were members of an introductory course in psychology. None had served previously in any experiment involving psychophysical judgments. All Ss showed a negative otological history 2 and a life history of minimal exposure to noise as determined from a standardized questionnaire.

1 This research was supported by the United States Air Force under Contract AF 33(616)-2626, monitored by the Bio-Acoustics Branch, Aero-Medical Laboratory, Wright Air Development Center, Ohio. Permission is granted for reproduction, translation, publication, use, and disposition, in whole and in part, by and for the United States Government.

Procedures.—The Ss were individually tested in a soundproof room equipped with a two-way voice communication system and a closed-circuit monitoring television system. The general testing procedure followed the method of limits in which the stimulus intensity was systematically varied in 2-db steps near threshold. The approach to the threshold region was made in 5-db steps. When the stimulus intensity was adjusted in 5-db steps, only one tonal pulse was presented; for the 2-db steps, three pulses were presented. The complete test for one frequency consisted of two descending and two ascending series in alternation.

The absolute threshold for a given frequency was taken as the mean of the recorded loss values for Series 1 through 4 ... heard two of the three tonal pulses. This value was then converted to an absolute sound pressure level (SPL) re .0002 dyne/cm.² on the basis of the prior sound pressure calibration of the audiometer output into a National Bureau of Standards Coupler 9-A in accordance with standardized procedures (1). Zero decibels on the audiometer was adjusted to approximate the minimum audible pressures of Sivian and White (14).
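
The threshold computation described above might be sketched as follows. This is an illustration under stated assumptions, not the authors' scoring rule: we assume a descending series is scored at the lowest level at which S still heard two of the three pulses, an ascending series at the first level meeting that criterion, and that conversion to SPL adds a single calibration constant. The function names, the example readings, and the 9.5-db offset are all hypothetical.

```python
from statistics import mean


def series_threshold(levels_db, pulses_heard, criterion=2):
    """Score one series of the method of limits.

    levels_db: audiometer readings in presentation order (2-db steps near threshold).
    pulses_heard: how many of the three tonal pulses S reported at each level.
    """
    meets = [n >= criterion for n in pulses_heard]
    if levels_db[0] > levels_db[-1]:  # descending series
        last_heard = None
        for level, ok in zip(levels_db, meets):
            if not ok:
                break
            last_heard = level
        if last_heard is None:
            raise ValueError("criterion not met at the start of the descending series")
        return last_heard
    for level, ok in zip(levels_db, meets):  # ascending series
        if ok:
            return level
    raise ValueError("criterion never met in the ascending series")


def threshold_spl(series_values_db, calibration_offset_db):
    """Mean of the four series values, shifted by an assumed calibration constant."""
    return mean(series_values_db) + calibration_offset_db


descending = series_threshold([20, 18, 16, 14, 12], [3, 3, 2, 1, 0])
ascending = series_threshold([8, 10, 12, 14, 16], [0, 1, 1, 2, 3])
spl = threshold_spl([descending, ascending, descending, ascending], calibration_offset_db=9.5)
print(descending, ascending, spl)
```

In practice the calibration would differ by frequency; a single constant is used here only to keep the example short.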

Both right and left ears of each S were tested in an alternating order between Ss. All tests were administered on a Beltone audiometer, Model 10-A, adapted for Permoflux PDR-8 earphones set in MX-41/AR cushions. The frequencies tested were always presented to each ear in the following order: 1000, 500, 250, 1000, 2000, 3000, 4000, 6000, and 8000 cps.3

Experimental conditions.—All Ss were tested at all frequencies on three different days. Difficulties in meeting both class and experimental hours, however, made it impossible to fix an exact number of hours between tests. The mean time elapsed between Tests 1 and 2 was 97 hr., with a range of 45 to 193 hr.; the mean time elapsed between Tests 2 and 3 was 138 hr., with a range of 42 to 433 hr.

2 The otological examinations were kindly performed by Dr. H. R. Glenn, Director, and Dr. E. S. Krug, Assistant Physician, of The Health Service, The Pennsylvania State University.

3 The initial threshold measurement at 1000 cps was considered a practice trial, and the mid-test 1000 cps threshold value was taken as the "correct" value in all subsequent computations.


Results 


The threshold data for both right 
and left ears at each frequency were 
treated by conventional analysis of 
variance techniques, with the total 
sum of squares divided into three 
components. The S’s score in each 
cell of the analysis of variance tables 
was the mean threshold SPL in 
decibels for Trials 1 through 4 at a 
given frequency. Thus, each S had 
a single threshold score for each ear 
for each of three audiometric tests 
at nine frequencies between 250 and 
8000 cps, inclusive. 

The results of the 18 analyses 
(nine frequencies for each ear) showed 
that only for 1000 cps on the right 
ear was there a significant F-ratio 
for the between tests (intra-S) source 
of variance; but the between-S (inter- 
S) source of variance was significant 
at all frequencies tested, excluding 
4000 and 6000 cps on the left ear. 

TABLE 1

COMBINED THRESHOLD DATA FOR AUDITORY TESTS 1, 2, AND 3

Frequency                        Inter-S    Intra-S
in cps        N*      Mean         SD         SD

Right Ear
  250        ...     25.19        5.72       2.64
  500         38     10.50         ...       2.48
 1000        ...      5.41        4.82       2.14
 1500         37       ...         ...       1.99
 2000        ...      7.08         ...       1.95
 3000         35     10.56         ...       2.31
 4000         34     13.21        6.44       2.65
 6000         37     24.99       10.82       3.49
 8000        ...     21.43       12.20       3.88

Left Ear
  250         38     25.91        7.69       3.19
  500        ...      9.65        5.96       3.45
 1000         38      4.84        5.21       2.40
 1500         34      6.09        5.38       2.69
 2000         33      6.18        7.25       3.73
 3000        ...     11.80        7.97       4.95
 4000         33     12.28        7.74       5.08
 6000        ...       ...         ...       6.19
 8000        ...       ...       10.51       6.51

* The N's in Table 1 are not all equal to 38 due to the loss of data resulting, primarily, from equipment problems.

Since in each analysis the columns (Tests 1, 2, and 3) could be taken to represent equivalent forms of the same test, the interaction variance was considered to be the error variance of a single score and $(\sigma_r^2 - \sigma_{rc}^2)/\sigma_r^2$ was computed as the reliability coefficient of the mean scores of Ss on the various tests. These values ranged from .87 at 250 cps to .95 at 8000 cps on the right ear and from .42 at 6000 cps to .87 at 1000 cps on the left ear. For each frequency, the reliability coefficients were consistently higher for the right ear, despite randomization in the order of testing between ears. Except for the unaccountably low value of .42 at 6000 cps on the left ear, the coefficients were fairly consistent for the frequencies within each ear.
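
A reliability coefficient of this kind can be computed from an Ss by tests table of scores. The sketch below assumes that the two variances in the formula are the mean square between Ss and the Ss by tests interaction mean square; that reading, the function name, and the small data set are our assumptions for illustration, not values from the study.

```python
def reliability_from_scores(scores):
    """scores[i][j] is the threshold of S i on test j; returns (MS_Ss - MS_interaction) / MS_Ss."""
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between Ss
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between tests
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_interaction = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_interaction = ss_interaction / ((n - 1) * (k - 1))
    return (ms_rows - ms_interaction) / ms_rows


# Hypothetical thresholds (db) for four Ss on three tests.
example_scores = [
    [10, 11, 9],
    [20, 19, 21],
    [15, 16, 14],
    [5, 6, 4],
]
reliability = reliability_from_scores(example_scores)
print(round(reliability, 3))
```

Large differences between Ss relative to the interaction term push the coefficient toward 1, which is the pattern the reported values of .87 to .95 suggest for the right ear.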

Table 1 presents the combined mean threshold SPL's for the three audiometric tests at each frequency, together with the combined inter-S SD obtained by averaging the variances of the three tests at each frequency and extracting the square root. Also, for each frequency, there is given the average intra-S SD obtained by computing the variance for each S across audiometric Tests 1, 2, and 3 and then finding the square root of the average variance for the groups of Ss. Notice that at all frequencies for both right and left ears the average intra-S variability is smaller than the average inter-S variability.
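
The two SDs described here can be computed as below. This is a sketch under one assumption not stated in the text: that each variance uses the usual n − 1 divisor. The function names and the scores are hypothetical.

```python
from math import sqrt
from statistics import variance


def inter_s_sd(scores):
    """Average the between-S variances of the separate tests, then take the square root."""
    n_tests = len(scores[0])
    test_variances = [variance([row[j] for row in scores]) for j in range(n_tests)]
    return sqrt(sum(test_variances) / n_tests)


def intra_s_sd(scores):
    """Compute each S's variance across tests, average over Ss, then take the square root."""
    s_variances = [variance(row) for row in scores]
    return sqrt(sum(s_variances) / len(scores))


# Hypothetical thresholds (db): rows are Ss, columns are Tests 1, 2, and 3.
example_scores = [
    [10, 11, 9],
    [20, 19, 21],
    [15, 16, 14],
    [5, 6, 4],
]
inter_sd = inter_s_sd(example_scores)
intra_sd = intra_s_sd(example_scores)
print(round(inter_sd, 2), round(intra_sd, 2))
```

Averaging variances before taking the root, rather than averaging SDs, is the order of operations the paragraph describes for both measures.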


Discussion 


The inter-S threshold variability data 
of the present study are in excellent 
agreement with the findings of previous 
investigators (4, 7, 15, 16). In general, 
inter-S variability tends to be minimal 
in the region of 1000 cps and increases 
as the frequency departs towards either 
end of the audible frequency range. 
The effect, however, is considerably 
more marked towards the higher than 
towards the lower audiometric fre- 
quencies. 

Comparative data on intra-S vari- 
ability are considerably more limited. 
A study by Harris and Myers (8) 
indicates that for frequencies of 256, 
1024, and 8192 cps, the SD’s of 40 
consecutive 4-crossing thresholds in a 
5-day interval ranged from 1.57 to 4.24 
db among 3 Ss. In the present study,
the average intra-S SD for these approximately
same frequencies ranged from
2.14 to 3.88 db on the right ear, and from
2.40 to 6.51 db on the left ear. No
explanation is available for the increased
range of variability on the left ear. The
right ear, nevertheless, shows agreement
to within .5 db with the intra-S variability
obtained by Harris and Myers (8).

The findings of this study support 
the general proposition that a treatment 
by Ss design increases the precision of 
an experiment when the inter-S vari- 
ability is eliminated as a source of error 
(12). Thus, at least for auditory thresh- 
old determinations by the method of 
limits, increased precision in psycho- 
physical experiments can be obtained 
by repeated measurements on the same 
Ss. Furthermore, there was no evidence 
of the major limitation of the treatment 
by Ss design: successive treatments 
administered to the Ss of this study did 
not tend to affect their responses. Ap- 
parently, the effects of the method of 
limits were either of short duration or 
were entirely dissipated during the
several days which elapsed between
experimental sessions.
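
The gain in precision claimed for the treatment by Ss design can be illustrated with a variance partition. The sketch below assumes a hypothetical Ss-by-treatments table; the data and the function names are invented for the example. It compares the error mean square obtained when differences among Ss are left in the error term with the one obtained when they are removed, as they are when the same Ss are measured repeatedly.

```python
def partition_ss(data):
    """Split the total sum of squares of an Ss-by-treatments table into
    between-Ss, between-treatments, and residual parts."""
    n_s, n_t = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n_s * n_t)
    s_means = [sum(row) / n_t for row in data]
    t_means = [sum(row[j] for row in data) / n_s for j in range(n_t)]
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_subjects = n_t * sum((m - grand) ** 2 for m in s_means)
    ss_treatments = n_s * sum((m - grand) ** 2 for m in t_means)
    ss_residual = ss_total - ss_subjects - ss_treatments
    return ss_total, ss_subjects, ss_treatments, ss_residual

def error_mean_squares(data):
    """Return (error MS with inter-S differences left in,
    error MS with inter-S differences removed)."""
    n_s, n_t = len(data), len(data[0])
    _, ss_subjects, _, ss_residual = partition_ss(data)
    ms_with_ss = (ss_subjects + ss_residual) / (n_t * (n_s - 1))
    ms_without_ss = ss_residual / ((n_s - 1) * (n_t - 1))
    return ms_with_ss, ms_without_ss

# Hypothetical thresholds in db: 4 Ss (rows) under 3 treatments (columns).
table = [
    [10.0, 12.0, 11.0],
    [20.0, 19.0, 21.0],
    [15.0, 15.0, 18.0],
    [5.0, 7.0, 6.0],
]
ms_with_ss, ms_without_ss = error_mean_squares(table)
print(ms_with_ss, ms_without_ss)
```

When Ss differ widely from one another but are individually stable, the second error term is far smaller than the first, so the same treatment difference is tested against less error.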
EXPERIMENT II
Method 
Subjects.—The Ss of this study were 10
volunteers (7 male; 3 female) from an
undergraduate course in educational psychology.
The ages ranged from 17 to 22 yr. No screening
techniques were employed in selecting Ss, other
than S’s statement that he had no known
auditory impairment.

Experimental conditions.—Two testing sessions
were separated by a period of 24 to 26
days. In each session 10 threshold measurements
were made on the right ear of each S
at 125 cps.

Procedure.—The audiometric procedure,
threshold computation, and testing chamber
were identical to those of Exp. I. The time
required for each threshold measurement (four
crossings) was approximately 3 min. At the end
of each measurement S was given a 2-min.
rest period, but the earphones were not removed.
Threshold measurements and rest periods were
alternated until E had administered 10 successive
tests. This procedure was followed for both
experimental sessions.


Results 


Table 2 presents the mean threshold
SPL in db re .0002 dyne/cm.² at
125 cps for each trial of the two
experimental sessions, together with
each corresponding SD. In each
session the mean decreased by approximately
4 db from the first to the
tenth test-trial. Also, the mean value
for the second session was consistently
lower than that for the first session
by about .7 db. Although the SD’s
did not vary systematically in the
first session, they tended to decrease
slightly during the course of the
second session.

TABLE 2
MEAN THRESHOLD VALUES IN DB RE .0002 DYNE/CM.² FOR A 125-CPS TONE
[The tabled means and SD's by trial are not legible in the source.]

To test for statistical significance, an analysis of variance was
performed on the threshold data. The analysis indicated that the
F-ratios obtained for the main effects of trials, sessions, and Ss
were all significant beyond the .01 level. Of the first-order
interactions, only the S by Session interaction was significant.
One-tailed t tests were then made on the differences between the
combined means for the first test-trial and each succeeding test-trial
until a significant difference at the .01 level was obtained.* This
occurred in the comparison between Trials 1 and 3. Additional tests
were made to determine the significance of the difference between the
mean threshold for Trial 3 and each succeeding trial. It was found
that Trial 5 differed significantly from Trial 3, but no significant
differences were found between the mean threshold for Trial 5 and any
of the remaining trials.
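
The stepwise comparison of trial means can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' computation: the trial means, the error mean square, the number of observations per mean, and the critical t are all hypothetical values chosen for the example, and the function names are invented. The critical value 2.37 is an approximate one-tailed .01 point for about 80 df and is supplied by the caller rather than computed.

```python
from math import sqrt

def t_for_drop(mean_ref, mean_later, error_ms, n_per_mean):
    """t for the drop from a reference trial mean to a later trial mean,
    using a common error mean square for both means."""
    return (mean_ref - mean_later) / sqrt(2.0 * error_ms / n_per_mean)

def first_significant_trial(means, ref_index, error_ms, n_per_mean, t_crit):
    """Index of the first later trial significantly below the reference,
    or None if no later trial qualifies."""
    for k in range(ref_index + 1, len(means)):
        if t_for_drop(means[ref_index], means[k], error_ms, n_per_mean) > t_crit:
            return k
    return None

def chain_of_drops(means, error_ms, n_per_mean, t_crit):
    """Repeat the search, each time restarting from the trial just found.
    Returns 1-based trial numbers."""
    found, ref = [], 0
    while True:
        nxt = first_significant_trial(means, ref, error_ms, n_per_mean, t_crit)
        if nxt is None:
            return found
        found.append(nxt + 1)
        ref = nxt

# Hypothetical combined trial means (db) for 10 trials; invented values.
trial_means = [24.0, 23.0, 21.0, 20.6, 19.4, 19.8, 19.6, 19.9, 19.5, 19.7]
drops = chain_of_drops(trial_means, error_ms=4.0, n_per_mean=20, t_crit=2.37)
print(drops)
```

The one-tailed test only looks for decreases, so later trials whose means rise slightly above the reference never count as significant.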


Discussion 


The results of this study showed that 
the threshold of hearing for a 125-cps 
tone was significantly lowered by the 
practice effects occurring between testing 
sessions and during the first five trials 
in a series of 10 threshold measurements. 
This finding is in general agreement with the work of Zwislocki (17)
who, in a study designed to evaluate earmuffs, found significant
differences between the binaural thresholds at 100 cps for seven
originally naive subjects tested by the method of adjustment in three
sessions one week apart; in this same study, it was also found that
the threshold was highest at the beginning of each session and
decreased rapidly during the first minute of testing. The findings of
the present study, however, appear to contradict the results of
Exp. I, in which no shifts occurred during three audiometric tests,
and the results of other investigators (9, 10, 13) who have reported
only chance threshold fluctuations in repeated tests. It is unlikely
that the discrepancy in results can be accounted for in terms of the
particular test tone which was used in this experiment, since Harris
and Myers (9) and Witting and Hughson (16) also used a 125-cps tone
and reported no systematic decrease in threshold values for this
frequency.

* Although the difference between sessions was statistically
significant, the mean values for trials across sessions were in close
agreement and, hence, were combined. In the t tests, the Trials ×
Sessions × S variance was used as the error term.

A more plausible explanation concerns 
the methodology and specific experi- 
mental design in which the threshold 
data were collected. In this study,
a closely spaced series of threshold
determinations was made in each session.
Previous studies, including Exp.
I, have allowed periods of an hour, day,
week, or month to elapse between
threshold tests. It is conceivable that
with such intervals, the learning from
a single test is not retained or, perhaps,
is modified by extraneous factors occurring
between testing periods. The 2-min.
rest periods provided between
trials in the present study may have 
minimized fatigue effects and may have 
permitted the practice effects to accumu- 
late from one threshold trial to the 
next. It should be observed, however, 
that the magnitude of threshold shift 
resulting from practice over 10 trials 
in any one session is approximately only 
4 db. This value is within the limits of
accuracy of ±5 db usually stated for
threshold determinations in clinical audi- 
ometry (5) and is probably of no practical 
significance. 


SUMMARY 


Two experiments are reported dealing with 
methodological factors and the absolute thresh- 
old of hearing. In Exp. I, 38 Ss were admin- 
istered audiometric tests by the method of 
limits on right and left ears at nine frequencies 
from 250 to 8000 cps in each of three experi- 
mental sessions. The results of this experiment 
indicated that the average inter-S variability 
was greater than the average intra-S variability 
for the frequencies tested. No significant intra- 
S sources of variance were obtained over the 
course of the three tests, but nearly all inter-S 
sources of variance were highly significant. It 
was concluded that the precision of psycho- 
logical experiments involving auditory thresh- 
old measurements may be increased by using 
repeated measures on the same Ss. 

In Exp. II, the right ears of 10 Ss were tested 
10 times by the method of limits at 125 cps 
in each of two testing sessions separated by 
24 to 26 days. The results indicated that the 
mean threshold values decreased about 4 db 
from Trial 1 to Trial 10 in each session. The 
mean threshold for the second session was about 
.7 db lower than that for the first. It was 
concluded that practice effects accumulating 
over the first five trials of a testing session may 
significantly decrease the absolute threshold 
for a 125-cps tone. This shift in threshold, 
however, is probably too small to be of any 
practical significance in clinical audiometry. 


REFERENCES

1. Anon. American standard method for the coupler calibration of
   earphones. Amer. Stand. Assoc., Dec., 1949.
2. Ciocco, A. Audiometric studies of school children. III: Variations
   in the auditory acuity of 543 school children re-examined after an
   average interval of three years. Ann. Otol. Rhinol. Laryngol.,
   1938, 47, 926-927.
3. Corso, J. F. The neural quantum theory of sensory discrimination.
   Psychol. Bull., 1956, 53, 371-393.
4. Dadson, R. S., & King, J. H. A determination of the normal
   threshold of hearing and its relation to the standardization of
   audiometers. J. Laryngol. Otol., 1952, 66, 366-378.
5. Gormley, G. J., & Carrell, J. A critical review of the literature
   on the validity and reliability of the audiogram. Speech Monogr.,
   1946, 13, 66-80.
6. Harris, J. D. Free voice and pure tone audiometry for routine
   testing of auditory acuity. Arch. Otolaryngol., 1946, 44, 452-467.
7. Harris, J. D. Normal hearing and its relation to normal audiometry.
   Laryngoscope, 1954, 64, 928-957.
8. Harris, J. D., & Myers, C. K. Experiments on fluctuation of
   auditory acuity. U.S.N. Bur. Med. and Surg. Rep. 196, Proj.
   NM 003-041, June, 1952.
9. Harris, J. D., & Myers, C. K. Experiments on the fluctuation of
   auditory acuity. J. gen. Psychol., 1954, 50, 87-109.
10. Herman, G. Variability of the absolute auditory threshold: a
    psychophysical study. J. acoust. Soc. Amer., 1953, 25, [pages
    illegible in source].
11. Lifshitz, S. Fluctuation of the hearing threshold. J. acoust. Soc.
    Amer., 1939, 11, 118-121.
12. Lindquist, E. F. Design and analysis of experiments. New York:
    Houghton-Mifflin, 1953.
13. Munson, W. A., & Wiener, F. M. Sound measurements for
    psychophysical tests. J. acoust. Soc. Amer., 1950, 22, 382-386.
14. Sivian, L. J., & White, S. D. On minimum audible sound fields.
    J. acoust. Soc. Amer., 1933, 4, 288-321.
15. Wheeler, L. J., & Dickson, E. D. D. The determination of the
    threshold of hearing. J. Laryngol. Otol., 1952, 66, 379-395.
16. Witting, E. G., & Hughson, W. Inherent accuracy of a series of
    repeated clinical audiograms. Laryngoscope, 1940, 50, 259-269.
17. Zwislocki, J. Design and testing of earmuffs. J. acoust. Soc.
    Amer., 1955, 27, 1154-1163.

(Received November 20, 1956)








Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 


AN ANALYSIS OF POSITIONING MOVEMENTS AND
STATIC REACTIONS ¹


EDWIN A. FLEISHMAN 


Yale University 


This paper describes another in a 
series of factorial studies concerned 
with the isolation and definition of 
dimensions of psychomotor abilities. 
One aim of these studies is to provide 
a functional classification of abilities 
adequate to account for individual 
differences in a wide range of psycho- 
motor performances. It is felt that 
these findings are relevant for general 
experimental psychology and may 
allow more dependable generaliza- 
tions of research results from one task 
to another, where the tasks are defined 
in terms of more functional ability 
dimensions. 

One of the first studies sought to clarify in a single study (3) a
wide range of factors identified by previous researches in this area
(for review of these see 2). Subsequent studies have examined the
areas of dexterity (4), gross physical proficiency and fine
manipulative skill (7), and more complex psychomotor performances,
e.g., tracking tasks, discrimination reaction tasks, tasks requiring
coordinate movements (5). In general, the results suggest that while
some of the factors isolated are narrow in scope, most of them are
broader than was formerly suspected and contribute to skill on many
superficially diverse kinds of tasks. Moreover, a relatively limited
number of meaningful factors now can be used to describe performance
in the large variety of tasks investigated.

¹ This study was carried out while the writer was with the Air Force
Personnel and Training Research Center. The work was done under ARDC
Project No. 7707, in support of the research and development program
of the Air Force Personnel and Training Research Center, Lackland Air
Force Base, Texas. Permission is granted for reproduction,
translation, publication, use, and disposal in whole and in part by
or for the United States Government. The writer gratefully
acknowledges the assistance of Walter E. Hempel, Jr., throughout the
conduct of this study.

The primary approach in these 
studies is (a) to construct batteries 
of psychomotor tasks designed with 
a view to certain hypotheses about 
the abilities contributing to per- 
formance, (b) administer the battery 
to a large sample of Ss, and (c) sub- 
ject the intercorrelations obtained to 
factor analysis study. One source of 
hypotheses regarding the important 
dimensions of psychomotor skill comes 
from the rational classifications utilized
in general experimental psychology
or in human engineering
research. Such classifications may
serve as a starting point for the design
of factor analysis studies aimed at 
testing empirically how such cate- 
gories hold up functionally from the 
point of view of individual differences. 

One such classification from the 
point of view of equipment design 
has been provided by Brown and 
Jenkins (1). These investigators 
classified motor abilities into (a) 
static reactions, (b) positioning re- 
actions, and (c) movement reactions. 
Each of these general categories is
then subdivided into other, more
restricted, motor response categories 
(e.g. continuous movements, discrete 
movements, etc.). The present study 
is oriented to a great extent around 
the classification system proposed by 
Brown and Jenkins. 














Specifically, the present investigation
is concerned with the areas of
positioning movements and static
reactions. Positioning reactions are
those in which the body members 
must be moved to a specified position 
in space. In this class of skills, the 
terminal accuracy of the response is 
the primary feature, as distinguished 
from movement reactions where skill 
during the response is of interest. 
Thus, in positioning movements one 
is more interested in the ability to 
localize one’s movements independently
of visual cues (e.g. as in
reaching to a given position in space, 
or moving a control a certain exact 
distance). In movement reactions, on 
the other hand, one would be inter- 
ested in such things as the ability to 
make smooth or coordinated control 
movements, to move a body member 
or control at a given rate, in a
rhythmical fashion, or along specific
pathways. Static reactions include 
instances where a body member is 
held for a time in a fixed position in 
space, where the maintenance of this 
position is the central task. This 
class of skills probably represents the 
least important class of skills from 
the point of view of equipment 
design, but the relationships of such 
skills to more complex performances 
are largely unknown. 

It should be stressed that in recent 
years there has been an increasing 
amount of research on environmental 
and procedural factors which affect 
the acquisition, maintenance, or dec- 
rement of such skills. However, in 
these studies interest is primarily in 
a single task studied intensively over 
repeated trials. In the present line 
of research, primary interest is in the 
initial abilities of individuals to per- 
form a variety of different tasks, in 
the interrelationships among these 
performances, and in the more precise
delineation of common and specific
abilities required to perform these
tasks.

The specific objectives of the present
study were (a) to explore the
organization of abilities in a wide
range of tasks requiring positioning
movements and static reactions, (b)
to determine the relationships between
abilities in these two areas,
(c) to determine the kinds of tasks
that best measure these abilities, and
(d) to evaluate the utility of certain
printed tests in reproducing variance
in apparatus tests of these skills.


PROCEDURE


Development of the Experimental Tasks

A battery of experimental tasks was constructed, designed to provide
for the following variations of positioning movements: (a)
discrimination of the extent of movement required, (b) discrimination
of direction of movement, (c) reproduction of prescribed movements to
prescribed positions, (d) initiation or estimation of prescribed
movements, (e) variations of linear versus nonlinear movements, (f)
control movements, reaching movements, and thrusting movements, (g)
variations in the control used to make the prescribed movement, and
(h) variations in the plane of the movement.

Practically all the tasks were administered without the use of vision
during the measurement trials. In general, this was preceded by
preliminary practice without vision, but with immediate knowledge of
results provided by allowing S to inspect his error after each
practice trial. No knowledge of results was provided during the test
trials. Emphasis in the instructions was for S to “remember” a given
position or the “feel” of his arm in making the movement. For the most
part, the extent of movement required was held constant in tasks
requiring discrimination of direction of movement and direction was
held constant where the task was to judge extent.

Among the static reactions tasks included in 
the battery were tests of arm-hand steadiness, 
tremor, and limb drift. 


Task Descriptions 


Brief descriptions of the task variables and
administrative conditions follow. Unless otherwise
specified, there were no time limits.








FIG. 1. TRACK TRACING
FIG. 3. ARM TREMOR
FIG. 4. HEX NUT STACKING
FIG. 5. ARM DRIFT
FIG. 6. POSITION FINDING
FIG. 7. STICK REACTION
FIG. 8. TARGET LOCATION


1. Track Tracing (Fig. 1).—Trace laterally from one end of an
irregular slot to the other with a T-shaped stylus inserted at arm's
length, without touching any part of the slot pathway, and then
retrace. Errors are recorded each time any part of the stylus touches
the top, bottom, or back of the slot. Score is the number of contacts
in four trials.

2. Steadiness Precision (Fig. 2).—Move a long stylus forward, at
slightly below shoulder height and at arm’s length, slowly and
steadily away from the body through a long cylindrical pathway.
Withdraw, again avoiding hitting the sides. Score is the number of
contacts in four trials.

3. Steadiness Tremor (Fig. 2).—Same apparatus
as No. 2 above, except S merely holds
the stylus steadily in the opening trying to
avoid hitting the sides. Score is the number of
contacts in four 1-min. trials.


4. Arm Tremor (Fig. 3).—Stand, holding a 
rod at arm’s length in as level a position as 
possible. Score is the maximum deviation from 
the level position during each trial for four 1-min. 
trials. 

5. Hex Nut Stacking (Fig. 4).—Using only one hand, stack a series of
small ([?] in.) hexagonal nuts, one on top of another, on their
smallest edges, four nuts to a stack. Score is the number of nuts
stacked correctly in two 30-sec. trials.

6. Arm Drift (Fig. 5).—Grasp a handle at arm’s length and hold it in
the exact position indicated for the duration of the test period. The
handle is attached through a fulcrum to a long rod so that there is
upward pressure on the handle. The rod contains a pointer which
indicates the amount of drift from the zero (level) position. Score is
the maximum deviation allowed during the test period of four 2-min.
trials.


7. Position Finding (Fig. 6).—Reach out and grasp a certain correct
peg out of a semicircular arrangement of 71 pegs and then return the
hand to a standard position. The S is seated in the center and all of
the pegs are white except for three black pegs, one directly in front
of him, and one each in the right and left quadrants. The S first
inspects the peg arrangement and receives preliminary practice in
reaching for the black pegs with the use of vision. Under test
conditions, S wears black goggles and receives verbal commands to
reach either the center, right, or left black pegs and return. Order
of presentation is fixed for each S, the sequence having been
determined randomly. Score is the cumulated error in terms of the
number of pegs deviation to the right or left of the correct peg on
each attempt in 20 trials.

8. Stick Reaction (Fig. 7).—Move a control stick into one of 21
possible slots arranged around a semicircular panel. The stick and the
slot panel are hidden from S by a shield and curtain. A panel of 21
lights above the apparatus indicates the slot to which the stick must
be moved. The S receives preliminary trials to acquaint him with the
relative position of the slots (e.g. Lights 1 and 21 indicate the left
and right end slots respectively, and Light 11 indicates the center
slot, etc.). After each attempt, the spring loaded stick returns to
center. Score is recorded electrically as the cumulated error in terms
of number of slots deviation from the correct slot on each attempt in
14 trials.

9. Target Location—Front (Fig. 8).—Reach out with a wax pencil and
mark as close to the bull’s eye of a series of wall targets as
possible. The S is seated on a stool in front of a panel containing
six targets arranged in two rows of three each. He receives visual
practice reaching the center of each target, and then receives a
series of test trials while wearing black goggles. He makes one thrust
at each target in response to verbal instructions (e.g. upper right
target, lower center target, lower left target, etc.). The S returns
his hand to his lap between attempts. Score is the cumulated radial
error from the center of each target in 12 trials.

10. Target Location—Side (Fig. 8).—Same as No. 9 except the targets
are located to the side of S in 12 trials.

11. Forward Stick Positioning (Fig. 9).—Reproduce the extent of
movement, forward and backward, of a stick control. The control
utilized is the stick component of the standard Complex Coordination
Test (8). This test differs from the other extent judgment tests in
that in this test the stick movement is not artificially limited to
one dimension, but may be moved laterally out of the to-and-from plane
unless S controls this. Score is the cumulated error from the target
positions in 12 trials.

12. Lateral Stick Positioning (Fig. 9).—Same as No. 11 except the
extent of movement to be reproduced is in the side-to-side plane, with
irrelevant free play also in the “to-and-from” plane in 12 trials.

13. Lever Positioning (Fig. 10).—Move a long lever, held at arm’s
length, to several designated positions. The S receives blindfolded
practice trials with immediate visual knowledge of results. During the
test the lever is displaced to a target position while S’s hand grasps
the lever. He must then return the lever to a center groove and
reproduce the position of the lever. He receives 12 trials where the
movement is laterally in front and 12 trials where the arm is
outstretched to the side and movement is front and back. Score is
cumulated error in terms of inches from the target position in 24
trials.


14. Direction Tracing (Fig. 11).—While blindfolded, draw a line,
radially away from the body across a board to a designated point on
the edge of the board. Before each trial, the pencil is placed at a
position on the edge of the board. The S then lifts the pencil back
toward his body to a center peg and, on command, moves the pencil
across the board to this same position. Score is the cumulated error
in units deviation from the target positions in 14 trials.

15. Rotary Positioning R (Fig. 12).—Reproduce the exact extent of a
rotary movement. The S grasps a knob fastened to a pointer which
pivots from the center of a circular dial. The knob is fastened to the
pointer by a washer-collar arrangement which permits S’s grasp to
remain unchanged as his arm moves around the dial. After practice with
eyes shut and immediate visual knowledge of results, S is blindfolded
and given a series of trials in which the pointer is moved to certain
positions, held there for 5 sec. and then returned to a stop at the
bottom of the dial. The S attempts to reproduce the movement to the
target position. Score is cumulated error in terms of total degrees of
deviation from the correct positions in 10 trials.

FIG. 10. LEVER POSITIONING
FIG. 11. DIRECTION TRACING
FIG. 12. ROTARY POSITIONING
FIG. 15. CONTROL MOVEMENT - E
FIG. 16. TARGET AIMING

16. Control Movement R (Fig. 13).—Move a handle, against friction,
through certain designated distances. A stiff arm movement is required
for S to push the control away from his body, as the control extends
at right angles to his body at arm’s length and by his right side.
After preliminary blindfolded trials, with visual inspection of error,
S grasps the control. The E then moves the control to a target
position where S’s hand remains for 5 sec. The S is then required to
push the control slowly back to the end stop and then pull it back
exactly to the target position. Score is cumulated error in inches for
10 trials.

17. Knob Positioning R (Fig. 14).—Move a 
knob from side to side through its slot to certain 
designated positions. The knob is mounted so 
it can be moved very easily through a groove 36 
in. long. The apparatus sits on a table with 
the knob up. After preliminary trials with 
knowledge of results, the seated and blindfolded 
S places his hand on the knob. The S grasps 
the knob and E then moves the knob to a target 
position where S’s hand remains for 5 sec. The 
S is then required to move the knob slowly back 
to the end stop and then return exactly to the 
target position. Score is cumulated error in 
inches for 10 trials. 

18. Control Movement E (Fig. 15).—Same apparatus as No. 16 above,
except it is mounted in an upright position. The S is required to move
against friction in an up-and-down movement; this time he must return
his arm halfway back to the target position in 10 trials.

19. Knob Positioning E (Fig. 14).—Same apparatus and scoring as
No. 17. However, after the knob and S’s hand are moved to the target
position, he moves to the end stop and estimates a return position
halfway back to the target position in 10 trials.

20. Target Aiming (Fig. 16).—Strike a series of metal targets through
holes in an upright panel as accurately as possible with a stylus. The
S is allowed to strike each of the six targets with the use of vision.
Practice is then given with the eyes shut and S inspects his error
after each thrust. Test trials are given without the use of vision.
After each thrust S returns the stylus to a groove in the center of
the panel. Score is cumulated for 24 trials in the following fashion:
2 points for striking the target, 1 for striking in the hole around
the target, and 0 for striking the panel around the hole.

21. Rotary Positioning E (Fig. 12).—Same apparatus and scoring as
No. 15, except S is told to move the pointer from the standard
position to certain estimated positions (e.g. pointed directly to the
top of the circle, or directly to the left or right) in 10 trials.

22. Line Drawing.—Draw a series of straight lines of certain specified
lengths. The S is given an L-shaped rule and while blindfolded starts
drawing from the inside corner of the L. The rule has a dent exactly
1 in. from the starting point and as S starts drawing he can feel when
he reaches this dent. He is told verbally to draw a line four times
this distance, and in repeated trials he must draw lines 7, 5, 3 in.,
etc., in length. Score is the cumulated error in terms of inches
deviation from the correct lengths in 12 trials.

23. Length Judgment.—This is a printed test 
containing a series of parallel lines of equal 
length. The S is required to make a mark on 
each line, a specified number of inches from its 
left end. This is indicated by a number to the 
left of each line. To aid S in his judgment, a 
reference line 1 in. long is shown in the upper 
right. Score is the total inches deviation from 
the correct answers in 20 items. 

24. Estimation of Length BP631BX (6).—This is a printed test
containing a series of items in which S is shown a stimulus line of a
certain length. He must then choose from a series of alternative lines
drawn at various angles, the one line equal in length to the stimulus
line. Score is the total number of items correct. Two 3-min. parts
(60 items each).
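
Two scoring rules recur in the task descriptions above: a cumulated error over trials, and the point system of the Target Aiming task. The sketch below shows one way these could be computed. It assumes that cumulated error means the sum of absolute deviations from the target over trials; the function names, the outcome labels, and the sample values are hypothetical and serve only to illustrate the arithmetic.

```python
def cumulated_error(responses, targets):
    """Sum of absolute deviations of each response from its target.
    Units are whatever the task uses (pegs, slots, inches, degrees)."""
    if len(responses) != len(targets):
        raise ValueError("one response is needed for each target")
    return sum(abs(r - t) for r, t in zip(responses, targets))

# Points per thrust in Target Aiming, as described for Task 20.
# The outcome labels are invented names for the three cases.
AIMING_POINTS = {"target": 2, "hole": 1, "panel": 0}

def target_aiming_score(outcomes):
    """Total points over all thrusts; a high score is good performance."""
    return sum(AIMING_POINTS[outcome] for outcome in outcomes)

# Hypothetical example: three position-finding attempts (peg numbers)
# and four aiming thrusts.
print(cumulated_error([3, 10, 18], [5, 10, 15]))
print(target_aiming_score(["target", "hole", "panel", "target"]))
```

Note the opposite direction of the two scores: a high cumulated error indicates poor performance, whereas a high Target Aiming total indicates good performance.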


Administration of the Tasks 


The individual tasks received considerable
pretesting to determine optimum administrative
conditions and to ascertain the difficulties and
reliabilities of the individual measures. In
certain cases, multiple scores from the same task
were evaluated. For example, for the Arm
Drift and Arm Tremor Tests, maximum deviation
as well as “final deviation” scores were
evaluated. It was found that for each task
these two scores correlated approximately .90
with each other, but that the “maximum deviation”
scores possessed the higher reliabilities.
Consequently only this score was used in the
subsequent battery. Similarly, the counter
score was found more reliable than a clock score
for the Steadiness Precision Test.

The complete battery was administered to 
200 basic trainee airmen at Lackland Air Force 
Base. To facilitate scheduling, a rotational 
procedure was used in which the order of 
occurrence of each task in the series was fixed, 
but different Ss started at different points in the 
series. Each task was administered to one S at 
a time. 
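
The rotational procedure can be pictured as a fixed cycle of tasks that each S enters at a different point. The following is a minimal sketch under that reading; how starting points were assigned to Ss is not stated in the text, so the `start` offset and the function name are assumptions made for illustration.

```python
def rotated_order(tasks, start):
    """Return the fixed task series beginning at position `start`
    (0-based) and wrapping around to the tasks that precede it."""
    k = start % len(tasks)
    return tasks[k:] + tasks[:k]

# The 24 task numbers in their fixed series order.
series = list(range(1, 25))

# A hypothetical S who starts at the sixth task in the series.
order = rotated_order(series, 5)
print(order)
```

Every S still takes every task once, and the order of succession between tasks is the same for all Ss; only the entry point differs.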

Table 1 presents the means, medians, SD’s 
and corrected odd-even trial reliabilities of the 
tasks as administered in the final battery. It 
will be recalled that for all these tasks, except 
Hex Nut Stacking, Target Aiming, and Esti- 
mation of Length, a high score indicates poor 
performance. 
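
The correction applied to the odd-even reliabilities is the Spearman-Brown formula, which for a test doubled in length is r_full = 2r / (1 + r), where r is the correlation between the two halves. A short sketch follows; the function name and the sample correlation are illustrative, and the general lengthening factor `k` is included only for completeness.

```python
def spearman_brown(r_part, k=2.0):
    """Estimated reliability of a test k times as long as the part
    whose reliability (e.g. the odd-even correlation) is r_part."""
    return k * r_part / (1.0 + (k - 1.0) * r_part)

# Hypothetical odd-even correlation of .50 stepped up to full length.
print(round(spearman_brown(0.50), 3))
```

The corrected value always exceeds a positive half-test correlation, which is why uncorrected odd-even correlations understate the reliability of the full set of trials.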

Although several of the tasks have relatively 
low reliabilities, most are generally reliable 
enough to permit inclusion in a factor analysis 
battery. Variables 11, 13, and 15, which had unacceptably low
reliabilities, were dropped from further analysis. The
distributions of scores for most of the task variables were
generally symmetrical, as might be inferred from the close
correspondence of the means and medians (relative to the SD's)
in Table 1.


TABLE 1

Means, Medians, SD's and Reliabilities of the Test Variables

Test                             Median    Mean      SD    Reliability*
 1. Track Tracing                   55     58.7     28.7      [?]
 2. Steadiness Precision           245    266.6    114.0      [?]
 3. Steadiness Tremor              170    181.9     74.8      86
 4. Arm Tremor                      65     66.4    [?].4      67
 5. Hex Nut Stacking                23     22.2      4.2      68
 6. Arm Drift                       47     57.4     31.6      [?]
 7. Position Finding                37     39.4     15.6      [?]
 8. Stick Reaction                  41     40.0     10.4      70
 9. Target Location—Front          129    126.1     32.9      86
10. Target Location—Side           145    143.0     32.5      [?]
11. Forward Stick Positioning      [?]      9.0      4.0      50
12. Lateral Stick Positioning       49      8.8      4.1      74
13. Lever Positioning             5[?]     54.3     14.9      58
14. Direction Tracing             3[?]     33.1     13.5      72
15. Rotary Positioning R            63     65.6     20.4      50
16. Control Movement R              82     87.5     30.9      67
17. Knob Positioning R             [?]      [?]     37.6      66
18. Control Movement E              77     7[?]     27.4      70
19. Knob Positioning E             [?]      [?]      9.9      [?]
20. Target Aiming                   17     17.1     10.5      [?]
21. Rotary Positioning E            71     72.9     22.4      70
22. Line Drawing                    51     52.2     23.9      87
23. Length Judgment                [?]     53.4     33.8      [?]
24. Estimation of Length            35     35.4     10.8      77

* Correlation of odd with even trials corrected to full length of
test by the Spearman-Brown formula. Decimals omitted. [?] marks
an entry that is illegible in the source.


RESULTS


Pearson product-moment correlations were computed among the 21
task variables remaining after exclusion of Variables 11, 13, and 15.
To avoid negative correlations which were purely a function of scoring
procedure, signs were reflected in the case of those variables in which
a low score is indicative of good performance (e.g., low error counts,
low deviations). The correlations obtained appear in Table 2.

It is immediately apparent that most of the relationships are
insignificant. Although performances on these tasks show a high
degree of self-correlation (as evidenced by internal consistency
reliabilities), the
skills required are highly specific to 
the individual tasks. The finding 
that many of the variables show not 
one significant correlation with any 
other variable precludes a factor 
analysis of the entire matrix. The 
finding of a high degree of specificity 
in the range of tasks investigated is, 
of course, an important finding in its 


own right. However, in order to 
delineate more precisely whatever 
small amount of common variance 


does exist in measures of this type, a 
limited factor analysis was carried 
out on 13 variables. Each of these, 
except Variable 6, showed at least 
two correlations statistically signifi- 
cant at the .01 level. 
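
The reliabilities that screened variables in and out of this analysis are odd-even correlations stepped up by the Spearman-Brown formula (Table 1, footnote). A minimal sketch of that correction follows, assuming the standard form of the formula; the function name and the input value of .75 are hypothetical and are not taken from the study.

```python
def spearman_brown(r_half: float, factor: float = 2.0) -> float:
    """Estimate the reliability of a test `factor` times as long.

    r_half is the correlation between the two halves (here, odd vs. even
    trials); factor=2 corrects a half-test correlation to full length.
    """
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


# Hypothetical odd-even correlation, for illustration only.
r_odd_even = 0.75
print(round(spearman_brown(r_odd_even), 3))
```

The correction always raises a positive half-test correlation, which is why a variable with a corrected value as low as .50 was treated as too unreliable to retain.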

Three factors were extracted from 
this reduced matrix using Thurstone’s 
Centroid Procedure (9), and the axes 
defined by these factors were rotated 
orthogonally to simple structure.


TABLE 2

Task Intercorrelations*

[Correlation matrix printed sideways in the source; its entries are not recoverable from this copy.]

* Decimals omitted.


TABLE 3

Centroid and Rotated Factor Loadings

                              Centroid Loadings*       Rotated Loadings*
Variable                        I     II    III         I     II    III     h²
 1. Track Tracing              49    -37    -10        61    [?]    [?]    [?]
 2. Steadiness Precision       41    -39   -0[?]       56     02    -09    [?]
 3. Steadiness Tremor          55    -33    [?]        63     02     14     42
 4. Arm Tremor                 34    -14    [?]        36    [?]    [?]     16
 5. Arm Drift                  30    -12     28        31    -08     28     18
 6. Stick Positioning          11     04     30        05    -09     30     10
 7. Direction Tracing          27     35   [?]7      -[?]2    26     39     22
 8. Rotary Positioning E       31     19    -30        11     45    -05     22
 9. Control Movement R         25     30   [?]2        00     25     32     17
10. Knob Positioning R         34     17     17        15     19     34     17
11. Control Movement E         30     23    -32        05     48    -05     24
12. Knob Positioning E         45     18    -13        23     42     13     25
13. Target Aiming              30    -17    -14        34     12    -08     14

* Decimals omitted. [?] marks an entry that is illegible in the source.


Table 3 lists the variables included in this limited analysis, and
presents the centroid factor loadings and the rotated factor loadings.
Only two rotations were required to achieve this rotated solution.
These rather compelling rotations have resulted in three readily
interpretable factors. Factor I is confined to those tasks originally
designed to measure static reactions. Factors II and III split into
two categories those tests designed to measure positioning reactions.
One of these (Factor II) is confined to those tasks requiring S to
move his arm to certain estimated positions, not immediately
experienced; the other factor (Factor III) involves only those tasks
requiring S to return his arm to a position from which it was just
removed.

These factors will be examined more closely; loadings above .25 are
listed in order for each of them.

Factor I has principal loadings in tasks which have identified a
factor called Arm-Hand Steadiness in previous studies (3, 7) and
defined as the ability to make precise, steady arm-hand movements of
the type which minimize strength and speed.


Task Loading 
3. Steadiness-Tremor 63 
1. Track-Tracing 61 
2. Steadiness-Precision 56 
4. Arm Tremor 36 
13. Modified Target Aiming 34 
5. Arm Drift 31 


The present results indicate that this 
factor extends to tasks requiring 
maintenance of a steady arm position 
in space where the crucial feature is a 
minimum of tremor. For tasks re- 
quiring steadiness during an arm 
movement, the factor appears inde- 
pendent of the plane of movement, 
whether a control or just the arm
alone is moved, or whether the arm 
is extended fully or not. It appears 
best measured by tasks which allow 
a record of the most minute tremors. 
For example, the three tasks with 
highest loadings provide electrical 
recording of the slightest tremor, 
while the Arm Tremor and Arm Drift 
tasks, which possessed somewhat 
lower loadings, provide scoring of 
more gross deviations only. The 
finding that the Arm Drift task has a 
slight loading on this factor suggests 
















that this factor may involve not only 
skill in avoiding relatively minute, 
high-frequency tremor movement, but 
also skill in avoiding large, slow drifts 
of limb position. 

Factor II is identified by positioning 
tasks but only those requiring the 
movement of the limb to a specified 
position, where this position must be 
estimated rather than reproduced from 
an immediately experienced limb posi- 
tion. 


Task Loading 
11. Control Movement E 48 
8. Rotary Positioning E 45 
12. Knob Positioning E 42 


The factor is not confined to linear 
movements, as evidenced by the 
loading of Rotary Positioning. These 
tasks also have in common the re- 
quirement of estimation of extent, 
rather than direction. However, ex- 
tent judgments also appear in Factor 
III, so this is not the unique feature 
of this factor. Since only three 
variables define this factor, however, 
labeling of this factor Position Esti- 
mation is highly tentative. 

Factor III is confined to those tests 
designed to measure positioning move- 
ments, but only to those in which 
reproduction of a specified movement 
is required. 


Task Loading 


7. Direction Tracing 39 
10. Knob Positioning R 34 
9. Control Movement R 32
6. Stick Positioning 30 


The common feature of all of these 
tasks is that S’s arm (or the control 
on which his arm rests) is moved by 
E to a given position and then removed
with the requirement that S
return his arm (or control) to this 
exact position. Moreover, among the 
13 variables in the analysis, these are 
the only tasks which impose this 
requirement. Other tasks, even those 
using the same apparatus or materials, 


which do not require exact reproduc- 
tion of the original limb position, do 
not appear on this factor. For the 
present, this factor is labeled Position 
Reproduction. 


Discussion 


Although many of the results of this
study must be considered negative, 
certain conclusions appear possible. 

First, it has been shown that static 
reactions are usefully considered as a 
separate class of skills from positioning 
movements. There was no overlap of 
factors between these two areas. Tasks 
originally designed to sample static
reactions grouped themselves on a factor
identified previously as Arm-Hand
Steadiness (3, 7). Previous research 
has shown that this class of skills also 
is functionally independent of skill in 
movement reaction tasks (3), fine ma- 
nipulative performance (2, 3, 7), and 
gross physical proficiency (7). In addi- 
tion, the present results indicate this 
factor to extend to tasks requiring 
steadiness during a movement, as well 
as to tasks requiring maintenance of a 
steady arm position, and may extend to 
skill in avoiding large, slow drifts of limb 
position. It appears best measured by 
tasks which allow a record of the most 
minute tremors. 

A major finding was the high degree of
specificity among positioning tasks.
Thus, although individuals are consistent 
in the accuracy achieved from trial to 
trial in performing a certain positioning 
task, little prediction can be made from 
one such task to another. For these 
tasks, the specific variance is much 
larger, in general, than the common 
factor variance. As can be seen from 
the communalities in Table 3 for Variables
6-13, the percentage of common
variance ranges only from 10 to 25%. 
(A direct estimate of the specific vari- 
ance for each of these variables is the 
difference between the reliabilities and 
communalities of each task.) 
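
The parenthetical estimate above can be worked through for one task. The sketch below assumes the usual definition of a communality as the sum of squared loadings on the common factors, and takes the specific variance as reliability minus communality, as the text states. The loadings (.31, .19, -.30) and the reliability (.70) are the values read for Rotary Positioning E from the scanned Tables 3 and 1; the function names are hypothetical.

```python
def communality(loadings):
    """Sum of squared loadings on the common factors (h squared)."""
    return sum(x * x for x in loadings)


def specific_variance(reliability, h2):
    """Reliable variance not shared with the common factors."""
    return reliability - h2


# Rotary Positioning E: centroid loadings and reliability as read from the scan.
rotary_e_loadings = [0.31, 0.19, -0.30]
rotary_e_reliability = 0.70

h2 = communality(rotary_e_loadings)
print(round(h2, 2))                                           # common variance
print(round(specific_variance(rotary_e_reliability, h2), 2))  # specific variance
```

On these figures roughly a fifth of the variance is common and nearly half is specific to the task, which is the pattern the paragraph describes.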

The small amount of common variance 
found among the positioning tasks was 














described in terms of two sets of common 
requirements clarified through the factor 
analysis procedure. Thus, the crucial 
distinction between two positioning tasks 
(in terms of abilities required) may not 
depend on whether the extent of the 
movement or direction of movement is 
to be discriminated, but rather if both 
tasks require exact reproduction of a 
movement or not. The Position Repro- 
duction factor, for example, was found 
in tasks requiring discrimination of the 
extent of a movement or the direction of 
a movement, in tasks requiring moving a 
control against friction or in moving a 
loose control, and in various planes of 
movement. It is possible some kind of 
“immediate kinesthetic memory” is in- 
volved. Caution is necessary in inter- 
preting these latter two factors as they 
are based on very low correlations where 
the significant ones are only marginally 
so. However, the factor analysis pro- 
cedure has apparently contributed some 
meaning to the pattern of even these low 
correlations, but further exploration is
obviously necessary. 

With regard to another objective of 
the study, it is clear that printed tests 
designed to reproduce positioning vari- 
ance in apparatus tasks failed to do so. 

In general, the results suggest that the
distinction between static and positioning
movements, provided by the original
rational classification of Brown and
Jenkins (1), holds up functionally in
terms of individual differences in such
skills. However, the rational distinctions
among the different kinds of
positioning movements do not hold up
functionally. A number of questions
remain unresolved by the present study.
For example, none of the tests sampled
S's sensitivity to control pressures, either
in requiring S to reproduce such pressures
or in requiring him to discriminate
them. Similarly, tasks requiring more
"pure" kinesthetic discriminations were
not included and the present study
suggests these may be related to this
class of skills. The high degree of
specificity among positioning tasks may
pose definite limitations on the kinds of
generalizations one may make from one
positioning task to another. It should
also be noted that these tasks were
administered without immediate knowledge
of results to S. It is possible that
a repetition of this type of study in
which such knowledge of results is
provided might result in higher correlations
among these tasks.


SUMMARY 


A series of tasks was developed and standardized
for the measurement of skill in the areas
of positioning movements and static reactions. 
The variety of tasks were developed to conform 
to a previous rational classification of such skills. 
Although there has been some previous experi- 
mental work relating environmental, procedural, 
or learning variables to performance of such 
skills, the present study was concerned with the 
interrelationships among the skills themselves. 

The complete battery of tasks was adminis- 
tered to 200 Ss, the intercorrelations were ob- 
tained, and a factor analysis of certain of these 
performed. The results indicate that skill in 
static reactions is usefully considered a separate 
class of skills from positioning movements. In 
terms of individual differences in such skills, 
performances on static reaction tasks showed a 
substantial degree of common variance. 

On the other hand, the major finding with 
respect to positioning tasks was that under the 
present conditions of administration there was a 
high degree of specificity associated with each 
task. Although Ss showed a high degree of 
consistency on individual tasks, little prediction 
could be made from performance of one such 
task to another. The small amount of common 
variance discovered was confined either to tasks 
requiring estimation of a specified limb position 
not previously experienced, or to other tasks 
requiring immediate reproduction of a previous 
movement or limb position. 


REFERENCES 


1. Brown, J. S., & Jenkins, W. O. An analysis
of human motor abilities related to the
design of equipment and a suggested
program of research. In P. M. Fitts
(Ed.), Psychological research on equipment
design. AAF Aviation Psychology Research
Report No. 19, Washington, D. C.:
Govt. Print. Off., 1947.

2. Fleishman, E. A. Testing for psychomotor
abilities by means of apparatus tests.
Psychol. Bull., 1953, 50, 241-268.

3. Fleishman, E. A. Dimensional analysis of
psychomotor abilities. J. exp. Psychol.,
1954, 48, 437-454. (Distributed separately
as USAF Personnel and Training
Res. Cent. Res. Bull., 1954, No. 54-15,
under title: A factorial study of psychomotor
abilities.)

4. Fleishman, E. A., & Hempel, W. E., Jr. A
factor analysis of dexterity tests.
Personnel Psychol., 1954, 7, 15-32.

5. Fleishman, E. A., & Hempel, W. E., Jr.
Factorial analysis of complex psychomotor
performance and related skills.
J. appl. Psychol., 1956, 40, 96-104.

6. Guilford, J. P., & Lacey, J. I. (Eds.)
Printed classification tests. AAF Aviation
Psychology Program Research Report
No. 5. Washington, D. C.: Govt. Print.
Off., 1947.

7. Hempel, W. E., Jr., & Fleishman, E. A. A
factor analysis of physical proficiency and
manipulative skill. J. appl. Psychol.,
1955, 39, 12-16.

8. Melton, A. W. (Ed.) Apparatus tests. AAF
Aviation Psychology Program Research
Report No. 4. Washington, D. C.:
Govt. Print. Off., 1947.

9. Thurstone, L. L. Multiple-factor analysis.
Chicago: Univer. Chicago Press, 1947.


(Received December 3, 1956) 











Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 









AGE DIFFERENCES IN TRANSFER AND RETROACTION AS
A FUNCTION OF INTERTASK RESPONSE SIMILARITY ¹


MICHAEL GLADIS ² AND HARRY W. BRAUN


University of Pittsburgh


This study investigated transfer
and retroaction in verbal learning
by young, middle-aged, and old
adult Ss. As such, it bears upon
the generalization contained in the
psychological literature (5, pp. 538-540)
that older Ss are more susceptible
to negative transfer, while positive
transfer increases early in life.

The chief findings which support this generalization
are those of Ruch (8) who tested
three groups of 40 Ss each, distributed over the
CA intervals 12-17, 34-59, and 60-82 yr., on
the following learning tasks: direct-vision rotor,
mirror-vision rotor, paired associates (logically
connected words), nonsense equations, and
false products. The inferiority of the old
group to the young group on the verbal materials
increased with the order given above.
The same order of deficit characterized the
comparison of the old group with the middle-aged
group but with less statistical significance.
In an interpretation of these results, Ruch
advanced the hypothesis that these obtained
decrements were functions of the amount of
reorganization of pre-existing response patterns
that was required.

However, in a replication of the verbal
learning aspects of Ruch's study, Korchin and
Basowitz found that with groups of young and
old Ss there was "virtually no difference between
scores on nonsense and interference materials
and no evidence of a greater degree of deficit
in the latter than in the former" (4, p. 67).
They also state that procedural differences may
account for the obtained divergent results.

Also cited in support of the above generalization
is the work of Gilbert (2) who compared
the retention of a variety of materials in groups
of Ss of CA 60-69 yr. and of CA 20-29 yr.
after matching on a vocabulary score. The
deficits shown by the older group were smallest
for the visual and auditory memory span for
digits and largest for the recall of English-Turkish
paired associates.

¹ This investigation was supported in part
by a research grant, B-787, from the National
Institute of Neurological Diseases and Blindness,
U. S. Public Health Service.

² Now at the Veterans Administration
Hospital, Leech Farm Road, Pittsburgh, Pa.


If older Ss are more susceptible to negative
transfer, the possibility suggests itself of demonstrating
this phenomenon with the retroactive
inhibition paradigm which permits the determination
of transfer as well as of retroaction
effects. Such an attempt was made by Cameron
(1) who had 12 senile patients with special
retention defects and a group of normal young
adults (N unknown) learn a series of three-place
numbers and then tested them for recall
after varying periods of time and after intervening
activity which consisted of spelling a
list of words. No transfer data were reported.
With the senile Ss, recall was poorer and sometimes
nonexistent after the intervening activity,
but the normal Ss showed little deficit.

The present study represents a
more extensive and systematic investigation
of transfer and RI in
verbal learning by young, middle-aged,
and old adult Ss. The original
learning (OL) of a set of paired
associates was followed by four types
of interpolated learning (IL) in which
the stimulus members of the paired
associates were identical to those in
OL, but in which the response members
were varied through four degrees
of similarity of meaning to the responses
in OL. This response variation
paradigm (S1-R1; S1-R2;
S1-R1) has been claimed by Osgood
to yield "negative transfer and retroactive
interference . . ., the magnitude
of both decreasing as similarity
between the responses increases" (7,
p. 527).


METHOD


Design.—Ten Ss from each of three age
groups, 20-29, 40-49, and 60-72 yr., were
randomly assigned to one of four experimental
conditions which represented four variations
of IL. In OL, all Ss learned the same set of
eight paired associates, e.g., TL-INSANE.
This was followed for a given S by one of the
four types of IL in which the stimulus members
remained identical while the response members
were varied through four degrees of similarity
of meaning: high similarity (TL-CRAZY),
moderate similarity (TL-DERANGED), low
similarity (TL-BALMY), and neutral similarity
(TL-ORAL).


TABLE 1

LEARNING MATERIALS

                                   Responses
Stimuli   OL         HS           MS          LS         NS
TL        INSANE     CRAZY        DERANGED    BALMY      ORAL
HX        WINDING    SPIRAL       CIRCLING    TWISTED    QUIET
WG        ROTTEN     DECAYED      RANCID      IMPURE     LIQUID
SN        COMPLETE   ENTIRE       PERFECT     UTTER      VULGAR
KZ        DISTANT    FAR-OFF      FURTHER     REMOVED    EAGER
DM        BRUTAL     RUTHLESS     HEARTLESS   UNKIND     EXTINCT
BV        STUBBORN   HEADSTRONG   MULISH      PERVERSE   SHAKY
FY        HIDDEN     CONCEALED    SECRET      COVERT     CRYING

Note.—HS, MS, LS, and NS refer to high, moderate, low, and neutral similarity respectively.


Learning materials.—The learning materials
appear in Table 1. The stimulus members
of the eight pairs consisted of two consonants.
These pairs of consonants were selected so
that they did not form any well-known abbreviation
and were not alphabetically adjacent.
The stimulus items were so assigned to the
response words that they did not form a larger
word when combined and neither member of
the pair of consonants alphabetically followed
or preceded the first letter of the response words
to which they were adjacent. The response
words were two-syllable adjectives selected
from materials prepared by Haagen which
consist of 400 common two-syllable adjectives
scaled in terms of similarity of meaning, vividness,
familiarity, and association value (3).
A preliminary study indicated that a high degree
of relationship exists between the ratings of
similarity of meaning given by Haagen's college
Ss and similar ratings made by Ss differing in
chronological age and level of education. This
finding warrants the use of Haagen's adjectives
in verbal learning experiments with Ss similar
to those employed in this study.

Procedure.—The paired associates were presented
on a Hull-type memory drum with the
rate of 4 sec. for the stimulus followed by the
stimulus and the response together for 4 sec.
The intertrial interval was 8 sec. Five different
orders of presentation of each list were
used to avoid serial learning. Two minutes
elapsed between OL and IL and between IL
and recall. Each list was learned to the criterion
of one perfect trial, but practice was
continued for one additional trial beyond this
criterion. All Ss were given the usual instructions
prior to OL. They were also told that
they would learn two lists of words in succession,
but not that they would be asked to relearn
the first list.

Subjects.—While all Ss were uncompensated
volunteers, only those who met the following
criteria were selected: (a) low average verbal
intelligence as measured by the vocabulary
test of the Wechsler Adult Intelligence Scale,
(b) freedom from any marked emotional or
physiological disturbance, (c) appropriate chronological
age. The 120 Ss (40 in each of the
three age groups) were either unemployment
insurance applicants at the Braddock, Pa.,
office of the Pennsylvania State Employment
Service, or convalescent medical or surgical
patients at the VA Hospital, University Drive,
Pittsburgh, Pa., or volunteer workers at that
Hospital.³
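
The design described under Method can be laid out as a small data structure. The following is an illustrative sketch only: the word lists are those of Table 1, but the function names, the subject identifiers, and the seeded shuffle are assumptions introduced here, not the authors' randomization procedure.

```python
import random

# Stimulus members and response columns as given in Table 1.
STIMULI = ["TL", "HX", "WG", "SN", "KZ", "DM", "BV", "FY"]
RESPONSES = {
    "OL": ["INSANE", "WINDING", "ROTTEN", "COMPLETE",
           "DISTANT", "BRUTAL", "STUBBORN", "HIDDEN"],
    "HS": ["CRAZY", "SPIRAL", "DECAYED", "ENTIRE",
           "FAR-OFF", "RUTHLESS", "HEADSTRONG", "CONCEALED"],
    "MS": ["DERANGED", "CIRCLING", "RANCID", "PERFECT",
           "FURTHER", "HEARTLESS", "MULISH", "SECRET"],
    "LS": ["BALMY", "TWISTED", "IMPURE", "UTTER",
           "REMOVED", "UNKIND", "PERVERSE", "COVERT"],
    "NS": ["ORAL", "QUIET", "LIQUID", "VULGAR",
           "EAGER", "EXTINCT", "SHAKY", "CRYING"],
}


def paired_list(column):
    """Pair every stimulus with its response in the given column."""
    return list(zip(STIMULI, RESPONSES[column]))


def assign_conditions(subject_ids, conditions=("HS", "MS", "LS", "NS"), seed=0):
    """Randomly split one age group into equal-sized IL conditions.

    The seed is a hypothetical convenience for reproducibility.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    per_cell = len(shuffled) // len(conditions)
    return {c: shuffled[i * per_cell:(i + 1) * per_cell]
            for i, c in enumerate(conditions)}


# One age group of 40 hypothetical subject numbers, ten per IL condition.
groups = assign_conditions(range(40))
print(paired_list("OL")[0], paired_list("HS")[0])
print({condition: len(members) for condition, members in groups.items()})
```

Every S learns `paired_list("OL")` first; only the interpolated list differs between the four cells of each age group.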


RESULTS 


Vocabulary scores.—The mean
WAIS vocabulary scaled scores of
the three age groups were: 10.5
(young), 11.7 (middle aged), and
12.7 (old). An analysis of variance
showed significant differences (P
< .01) due to age. However, differences
in vocabulary scores were
not significant among the groups who
had undergone the different conditions
of IL, nor was the interaction of age
and IL conditions significant. The
mean vocabulary scores for the four
IL groups were: 11.6 (high similarity),
11.5 (moderate similarity), 11.9 (low
similarity), and 11.6 (neutral similarity).

³ The authors wish to acknowledge the
assistance of Dr. Joseph Newman of the VA
Hospital, University Drive, Pittsburgh, Pa.,
and of Mr. A. Allen Sulcowe of the Pennsylvania
State Employment Service, Harrisburg, Pa.


TABLE 2

Raw Means of OL, IL, and Recall

[Table body not recoverable from the source; surviving column labels: Variable, Recall (Items).]


Original learning.—The mean trials
to the criterion of OL for the three
age groups as well as of the four
similarity subgroups are presented
in Table 2. Transformed (reciprocal)
means as well as OL means adjusted
for differences in vocabulary appear
in Table 3. Since significant differences
were obtained among the age
groups in vocabulary level, an analysis
of covariance was carried out in which
the OL mean squares were adjusted
for differences in vocabulary. Reciprocals
of trials to criterion × 1000
were used as OL and IL scores in
the analyses. A summary of this
analysis appears in Table 4. Differences
in age-group means in OL
are significant (P < .01), but the
F values for the various conditions
of IL and the interaction are not
significant. Transformation of OL
as well as of IL scores failed to
remove heterogeneity of variance.
A significant (P < .05) product-moment
correlation coefficient was obtained
between vocabulary scores
and OL trials (Table 5).


TABLE 3

Transformed Means and Adjusted Means of OL, IL, and Recall

[Column layout not recoverable from the source. Row labels: Age (yr.) 20-29, 40-49, 60-72; Similarity High, Moderate, Low, Neutral. Column labels include M and Adj. M.]
[Legible entries for the age rows, column by column in source order: 79.98, 59.65, 53.30; 85.78, 59.18, 48.10; 116.88, 80.88, 72.30; 112.23, 83.48.]
[Legible entries for the similarity rows: High 63.17; Moderate 63.00, 63.47; Low 66.17, 64.75; Neutral 64.90, 64.90; unplaced entries 63.17, 99 67, 86.73, 84.53, 89.13.]
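
The score transformation named above (reciprocal of trials to criterion, multiplied by 1000) is simple enough to state directly. In this minimal sketch the function name and the example trial counts are hypothetical; only the formula comes from the text.

```python
def reciprocal_score(trials_to_criterion: int) -> float:
    """Convert trials to criterion into a rate-of-learning score.

    Larger scores mean faster learning, which reverses the direction
    of the raw trials measure.
    """
    if trials_to_criterion <= 0:
        raise ValueError("trials to criterion must be positive")
    return 1000.0 / trials_to_criterion


# Hypothetical trial counts, for illustration only.
for trials in (10, 20):
    print(trials, reciprocal_score(trials))
```

Because the transformed score is a rate, a higher transformed mean in Table 3 corresponds to fewer trials in Table 2.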


TABLE 4

Analysis of Covariance: OL, IL, and Recall

Source              OL        IL
Age (A)           12,423    10,664
Similarity (S)        31     1,607
A × S                211     1,283
Within               847     1,726
Total

[Only the entries above are legible in the source; the remaining columns, including those for Recall, are lost.]
Note.—[beginning of note lost] ... 106 and 117 for IL, and 105 and 116 for Recall.


TABLE 5

r's Between the Various Measures

[Table body not recoverable from the source; surviving labels: IL, Recall.]

Note.—For 117 df, an r > .18 is significantly different
from zero at the .05 level.
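
The threshold in the note to Table 5 can be checked with the usual t test for a correlation coefficient, t = r * sqrt(df / (1 - r^2)). The sketch below assumes that test and a tabled two-tailed critical value of about 1.98 for 117 df; the function name and the constant are introduced here for illustration and are not part of the original analysis.

```python
import math

# Approximate two-tailed .05 critical value of t for 117 df (assumed from tables).
CRITICAL_T_05 = 1.98


def t_from_r(r: float, df: int) -> float:
    """t statistic for testing a product-moment correlation against zero."""
    return r * math.sqrt(df / (1.0 - r * r))


for r in (0.18, 0.22, 0.57):
    t = t_from_r(r, 117)
    print(r, round(t, 2), t > CRITICAL_T_05)
```

An r of .18 sits almost exactly on the boundary, so the reported coefficients of .22 and .57 clear it while anything at or below .18 does not.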


Interpolated learning.—Table 2 also 
contains the raw mean trials to the 
criterion of IL of the three age groups 
and of the four similarity subgroups. 
Transformed (reciprocal) means and 
means adjusted for differences in 
vocabulary and OL appear in Table 3. 
The rate of IL tended to decrease 
with increasing age and with decreas- 
ing response similarity between OL 
and IL. An inspection of these 
data indicates that positive transfer 
occurred. To determine whether a 
significant amount of transfer took 
place, a multiple classification analysis 
of covariance was performed in which 
the IL mean squares were adjusted 
for differences in rate of OL and of 
vocabulary level. The summary of 
this analysis (Table 4) indicates
that significant differences (P < .01) 
in amount of transfer did occur among 
the age groups but that there were 
insignificant differences among the 
various similarity conditions. The 
interaction of age and similarity of 
meaning was not significant. A more 
detailed analysis of the transfer trends 
showed that the significant differences
among the age groups were
primarily due to the greater amount 
of transfer that occurred in the CA 
20-29 yr. group since the difference 
in the amount of transfer between 
the other two older groups was slight. 
The covariance adjustment indicated
that 36% of the variance in IL could
be accounted for by the combined
variables of OL and vocabulary
level. Significant (P < .05) product-moment
correlation coefficients were
obtained between vocabulary and IL
scores (.22) and between OL and
IL (.57) (Table 5).

Recall of original learning.—The
mean number of items of the original
list recalled on the first relearning
trial by each age group and by each
similarity subgroup appears in Table
2. Recall decreased with increasing
age and with decreasing response
similarity. Since significant differences
had been found among the three
age groups in vocabulary level and
rate of both OL and IL, a multiple
classification analysis of covariance
was carried out to adjust the mean
squares of the recall scores for these
differences. The summary of this
analysis appears in Table 4. It
shows that retroactive interference
(RI) increased significantly (P < .01)
as the degree of similarity of meaning
between the response members of
OL and IL decreased and that no
reliable differences in RI were obtained
among the three age groups.
The interaction of age and similarity
also was not significant. An analysis
of covariance of relearning scores
yielded similar results. Table 3 contains
the adjusted recall means,
recall scores not having been transformed.
With the covariance adjustment,
33% of the variance in recall
could be accounted for by the combined
variables of vocabulary, OL,
and IL. Correlation coefficients between
recall scores and the other
variables appear in Table 5.


Discussion 


The RI results of this study are consistent
with those obtained by Osgood
(6) and Young (10), both of whom
found that RI tended to decrease as
the degree of similarity of meaning
between response items of OL and IL
increased. The variations of IL in
this study are the same as those of
Young except that he did not employ
the condition of "neutral" similarity.
Osgood used as IL responses, words
which were similar, neutral, and opposed
to OL responses.

The finding of positive transfer does
not support the prediction relative to
the transfer sign which is claimed for
the particular RI paradigm employed
herein. This finding is not unusual,
however, and is customarily attributed
to previous learning of the transferred
responses or to "learning how to learn."
With regard to the latter consideration,
attention is directed to the relatively
low educational level of the Ss as compared
to college Ss and to the fact that
their experience in memory drum learning
was most probably confined to their
participation in this study. The continued
significance of the age variable
in IL after analysis of covariance is
surprising and suggests a fruitful area
of speculation. Such speculation, however,
should probably await the independent
verification of this finding.
The demonstration of positive transfer
is not, on the face of it, consistent with
the generalization that older Ss are
more susceptible to negative transfer.
However, the fact that the amount
of positive transfer that occurred in
the youngest group was greater than
that for the two older groups may be
adduced as indirect support of this
generalization.

The results of Cameron (1) and Gilbert
(2) provided some basis for predicting
that RI would increase with increasing
age. However, it has been clearly
shown that no significant differences
in RI occurred among the three age
groups. A direct comparison between
the results of this study and those of
Cameron and Gilbert is not possible
since Gilbert did not manipulate interpolated
activity while Cameron's older
group was composed of senile Ss who
had specific memory defects. In addition,
it is not known if Gilbert matched
her groups on the basis of learning
ability as well as vocabulary level. This
latter point is important because when
recall scores in the present study were
not adjusted for differences in learning
ability, significant differences in RI
did occur among the groups, with RI
increasing with increasing age.

The demonstration of an inverse
relationship between age and learning
ability is consistent with results obtained
by Ruch (8) and Thorndike (9).

Finally, it is apparent that this study 
measured age differences in transfer 
and retroaction rather than age changes. 


SUMMARY 

This study compared transfer and retroactive
interference (RI) in middle-aged and
old Ss with young adults as a function of
varying the degree of similarity of meaning
between original and interpolated verbal
responses while keeping constant the stimulus
members of the paired associates.

Forty Ss in each of three age groups were
[illegible] 20-29, 40-49, and 60-72 yr. Ten Ss
in each group were assigned to one of four
conditions in which the degree of similarity
of meaning between the responses of the original
and interpolated lists varied from high to neutral.
Under all conditions Ss learned the same
original list of paired-associates (e.g.,
TL-INSANE).

Significant differences in the amount of
positive transfer occurred among the three age
groups. RI decreased as the degree of similarity
between the response members of OL
and IL increased. No significant differences
in RI were found among the three age groups
after recall and relearning scores were adjusted
for differences in vocabulary level and learning
ability.

The results are discussed in terms of their 
application to the hypothesis that older Ss 
are more susceptible to negative transfer and 
retroactive interference. 


REFERENCES 


1. Cameron, D. E. Impairment of the retention phase of remembering.
Psychiat. Quart., 1943, 17, 395-404.

2. Gilbert, J. G. Memory loss in senescence. J. abnorm. soc. Psychol.,
1941, 36, 73-86.





MICHAEL GLADIS AND HARRY W. BRAUN 


3. Haagen, C. H. Synonymity, vividness, familiarity and association
value ratings of 400 pairs of common adjectives. J. Psychol., 1949, 27,
453-463.

4. Korchin, S. J., & Basowitz, H. Age differences in verbal learning.
J. abnorm. soc. Psychol., 1957, 54, 64-69.

5. McGeoch, J. A., & Irion, A. L. The psychology of human learning.
New York: Longmans, Green, 1952.

6. Osgood, C. E. Meaningful similarity and interference in learning.
J. exp. Psychol., 1946, 36, 277-301.

7. Osgood, C. E. Method and theory in experimental psychology. New York:
Oxford Univer. Press, 1953.

8. Ruch, F. L. The differentiative effects of age upon human learning.
J. gen. Psychol., 1934, 11, 261-286.

9. Thorndike, E. L., et al. Adult learning. New York: Macmillan, 1928.

10. Young, R. K. Retroactive and proactive effects under varying
conditions of response similarity. J. exp. Psychol., 1955, 50, 117-119.


(Received December 4, 1956)





Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 


A GOALLESS GRADIENT¹


A. C. PEREBOOM²


Texas Technological College 


A Behavior System (4) was Hull’s final attempt to extend his primary
principles of molar behavior to phenomena that should be deducible
from those principles. Among such phenomena Hull included the classical
problem of intra-maze learning, the rat’s relative performance in various
parts of a spatial maze.

Delay of reward is an important variable in Hull’s system and he uses
it to deduce the goal-gradient hypothesis. Since distance from the
incentive correlates with the time it takes to reach it, the appropriate
instrumental responses should progressively improve, the closer they
are to that incentive.

The straight runway is a simple means for testing this assertion.
Hull (3) had a 40-ft. runway partitioned off into segments by nonretrace
doors. Using times to traverse these segments as his response measure,
he essentially verified his hypothesis, except for the segment closest
to the goal where a slowing down occurred. With the same response
measure, but allowing retracing, Crespi (1) obtained similar results.

There are several indications, however, that the gradient is not
entirely a function of a spatially localized reward. (a) With a 20-ft.
runway, Hull obtained a gradient of the same form, while his hypothesis
predicts the second half of the 40-ft. gradient. (b) Hull got the steepest
gradient during the early trials and Crespi got it with the smallest
incentive. (c) Drew (2) obtained a similar gradient when food was
obtained in each runway segment. The present experiments were designed
simply to determine what kind of gradient appears on the first trial,
the trial before the reinforcement principle could operate.

¹ This research was supported in part by a grant from the National
Science Foundation.

² The writer is indebted to Dr. Keith J. Hayes for his invaluable
criticisms during the preparation of this article.

EXPERIMENT I 


35 and 
"Ss fr 


than 


ts between 
-ezing 
older 
had they e 
Their 


open 


ver 
schedule. 
been in 


experien an 


field study and this did not include any spatially
localized rewards.


The runway was the elevated type, 16 ft. long, and painted a semigloss
black. Overhead lighting fixtures gave nearly homogeneous illumination
to the runway. The various sections of the runway were defined by small
white “darts” painted on its edge. Those 6 in. from each end defined the
start and end sections, respectively. The remaining 15 ft. were
subdivided by such markings into five 3-ft. sections.

Latency, the time to leave the first 6 in., and total running time were
recorded by a stop watch. A record of sections entered, order of entry,
and time in each section, was obtained with a type of “interaction”
chronograph.

Without any previous runway adaptation, S was placed on the start
section (S) and the stopwatch started. Upon entering the first section (I)
the latency was noted and the first key of the chronograph was depressed.
If he returned to S this key was released, although the chronograph
continued running. Upon re-entering I, it was again depressed and held
down until he entered II, in which case the second key was depressed,
or until he returned again to S. All Ss eventually entered the end
section, from which they were removed, but total running time varied
from 2 to 15 min. The accuracy of the chronograph was checked against
the stopwatch and found to be satisfactory.


[Figure 1: mean time in seconds (y-axis) for each runway section (x-axis);
two curves, time spent in each section and time to traverse each section.]

Fig. 1. Intrarunway performance of
10 rats on their only trial.


Results.—Performance was recorded by two methods: time to traverse
each section and total time spent in each section. Note that time to
traverse means the time from first entering a given section to first
entering the next section; the two methods would give the same data if
no retracing occurred. Figure 1 shows an orderly gradient for the
time-spent measure but retracing distorts the time-to-traverse curve.
Not only did Ss spend more total time in the earlier sections, but they
entered them more often: The mean numbers of retracings from Sections I
through V were 1.2, .6, .5, .1, and .1, respectively, and to or through
sections S through IV were 2.3, 1.3, .7, .2, and .1, respectively. Thus,
Ss retraced more from the earlier sections and then usually back to the
start section. Individual Ss support these group trends.

The performance of individual Ss supports the time-spent gradient but
the time-to-traverse curve is ambiguous in this respect. One S is
responsible for the large mean traversal time of Section V. Such gross
artifacts do not appear in the mean time-spent curve. When retracing is
both permitted and occurs, the latter measure is more representative of
the individual’s performance.
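The difference between the two response measures can be made concrete with a short sketch. It is only an illustration under stated assumptions: the entry log, the integer section codes (0 for the start section, 1 to 5 for Sections I to V, 6 for the end section), and the function name `section_times` are hypothetical and are not taken from the article.

```python
def section_times(entries, end_time):
    """Compute two measures from a log of section entries.

    entries  : list of (time_sec, section) pairs in order of entry
               (hypothetical coding: 0 = start, 1..5 = I..V, 6 = end)
    end_time : time at which the trial ended
    Returns (spent, traverse):
      spent[s]    = total time spent in section s, retracings included
      traverse[s] = time from first entering s to first entering s + 1
    """
    spent = {}
    first_entry = {}
    for i, (t, sec) in enumerate(entries):
        # Time in a section runs until the next entry (or the end of the trial).
        t_next = entries[i + 1][0] if i + 1 < len(entries) else end_time
        spent[sec] = spent.get(sec, 0.0) + (t_next - t)
        first_entry.setdefault(sec, t)

    traverse = {}
    for sec, t_first in first_entry.items():
        if sec + 1 in first_entry:
            traverse[sec] = first_entry[sec + 1] - t_first
    return spent, traverse


# Hypothetical trial: the animal enters I, retraces to the start, then runs through.
log = [(0, 0), (10, 1), (25, 0), (40, 1), (50, 2), (58, 3), (64, 4), (69, 5), (73, 6)]
spent, traverse = section_times(log, end_time=73)
print(spent[1], traverse[1])
```

In this made-up log the time-to-traverse value for Section I also absorbs the time spent back on the start section, which is the kind of distortion the text attributes to retracing; with no retracing the two measures coincide.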


EXPERIMENT II 


Method.—The same Ss were tested at a later date on a Tolman-Honzik
elevated T maze. As with the previous runway, no doors or goal box were
present. Again only one trial without adaptation was given to each S and
retracing was permitted.

Latency to leave the first 6 in. of the first stem and total time to
enter the last 6 in. of the last correct arm were taken. No S was
discarded. Running times varied from 2 to 25 min. The main response
measure was cul entries and they include those obtained while retracing.

Results.—Figure 2 shows the mean
number of cul entries for each cul.
Again an end-gradient was obtained
although not as smooth as that found
on the runway with the comparable
response measure.


[Figure 2: mean cul entries (y-axis) plotted against choice-points from
start section (x-axis).]

Fig. 2. Intramaze performance of
10 rats on their only trial.







Discussion 


A strong preference for the earlier sections of the runway is indicated
by two facts: the time-spent gradient and the greater number of
retracings in and to the earlier sections. It is postulated that fear
initiates both; it keeps S from proceeding uninterrupted toward the end
section. It is also postulated that curiosity keeps S trying. Thus, with
time and repeated false starts fear is eliminated from the earlier
sections, and this loss of fear generalizes to the later sections,
permitting S then to proceed through new sections with little or no
additional retracing.

The comparable measure on the multiple-T maze to time-spent on the
runway is number of entries into each cul. Here, too, a gradient is
obtained. The greater irregularity of this gradient is most likely due
to the greater heterogeneity of its environment, compared with that of
the runway, and the discreteness of its response measure.

Will “goalless” principles explain the goal gradient? If so, then delay
of reward becomes a minor factor in the goal gradient. Latent learning
experiments showing the high quality of first-trial performance
following maze exploration appear to support this conclusion.

SUMMARY 


The goal gradient hypothesis has led to many conflicting studies. The
purpose of the present experiments was to determine if a rat begins his
maze learning on Trial 1 with more or less equal intra-maze performance.

Using total time spent in the 6-in. start section and in each 3-ft.
section of a 16-ft. elevated runway as the response measure, a clear
end-of-runway gradient was obtained. These same Ss were given one trial
on a Tolman-Honzik elevated T maze where retracing was also permitted.
In terms of number of entries into each cul, an end-of-maze gradient was
obtained.

An explanation for these first-trial gradients is given in terms of a
curiosity-fear conflict and the principle of stimulus generalization.


REFERENCES


1. Crespi, L. P. Quantitative variation of incentive and performance in
the white rat. Amer. J. Psychol., 1942, 55, 467-517.

2. Drew, G. C. The speed of locomotion gradient and its relation to the
goal gradient. J. comp. Psychol., 1939, 27, 333-372.

3. Hull, C. L. The rat’s speed of locomotion gradient in the approach to
food. J. comp. Psychol., 1934, 17, 392-422.

4. Hull, C. L. A behavior system. New Haven: Yale Univer. Press, 1952.


(Received December 5, 1956)





Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 


TWO-CATEGORY JUDGMENTS OF SEQUENCES OF
STIMULI OF TWO VALUES¹


CLINTON DE SOTO²


University of Wisconsin 


In most research on judgments 
using the method of single stimuli, 
the sequence in which stimuli are 
judged is not explicitly treated as 
an independent variable, even though 
the basic experimental operation is 
to present a sequence of stimuli for 
judgment. The probable reason for 
this is that there are x^y possible
sequences involving x available
stimuli (or values of stimuli) and y 
presentations of stimuli, a number 
which rapidly becomes inconveniently 
large as x or y increases. 
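The growth of the count x^y can be checked directly. The following is a minimal sketch; the function names are hypothetical, and the figure of 96 presentations is borrowed from the sequences described later in the article only to show the scale involved.

```python
from itertools import product


def n_sequences(x, y):
    """Number of distinct sequences of y presentations drawn from x stimulus values."""
    return x ** y


def enumerate_sequences(values, y):
    """List every sequence explicitly; practical only for very small x and y."""
    return list(product(values, repeat=y))


# Two values and three presentations give 2^3 = 8 sequences.
small = enumerate_sequences(("low", "high"), 3)
print(len(small), n_sequences(2, 3))

# Two values and 96 presentations already give an unmanageably large count.
print(n_sequences(2, 96))
```

Even with only two stimulus values, the number of possible sequences doubles with each added presentation, which is why sampling from a fixed stimulus distribution is the usual practical choice.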

The usual procedure is to prepare, 
for each experimental condition, a 
frequency distribution of stimuli (a 
stimulus distribution) and to draw 
successive stimuli randomly from 
this distribution. Thus all judgments 
of stimuli are considered to be relative 
to an unchanging stimulus distribu- 
tion, which becomes the primary 
independent variable. Much experi- 
mental and theoretical work has 
been done with respect to the effects 
of range and shape of such constant 
stimulus distributions on judgments 
(1, 2, 4, 8). One conclusion of this 
work is that Ss quickly center their 
judgment scale on some average 
of the stimulus distribution, which 
is then called their adaptation-level 
(AL). In the case of two-category 
judgments, the probability of a high 
(upper category) judgment is a mono- 
tonic increasing function on the 
stimulus dimension, often an ogive, 


¹ This article is based on portions of a doctoral dissertation submitted
to the University of Wisconsin. The supervision and advice of
J. C. Gilchrist are gratefully acknowledged.

² Now at The Johns Hopkins University.


passing through .50 at the AL. Dis- 
placing the AL displaces the entire 
frequency function on the dimension. 

A few Es have shifted Ss from one 
stimulus distribution to another at 
some point in the sequence of stimuli 
(3, 5, 6, 7, 9). Following a shift, 
Ss recenter their judgment scale 
on the average of the new stimulus 
distribution, although the rate of 
recentering seems to vary under dif-
ferent conditions. These studies in-
dicate that important phenomena may 
be overlooked when judgments are 
regarded as representing a constant 
adjustment to a constant stimulus 
distribution; the complex problem of 
relating judgments directly to se- 
quences of stimuli should perhaps 
be attacked. 

This article reports an effort to relate two-category judgments to
some sequences of stimuli restricted, for simplicity, to two values,
high and low. A sequence of stimuli of two values is composed of
alternating runs of the two kinds of stimuli. The main concern of the
study is with Ss’ centerings and recenterings of their judgment scales
during these runs, as manifested in empirical probabilities of high
judgments for the stimuli presented. The general hypothesis is that Ss
recenter their judgment scale, i.e., change their AL, by some continual
re-averaging of the stimuli presented, so that during a run of stimuli
of the same value they will gradually recenter their judgment scale on
this stimulus value. If such a recentering occurs during a run of
stimuli of the same value, the probability of a high judgment for these
stimuli will move during the run toward .50 from any prior level. At
the same time, the probability of a high judgment for stimuli of the
other value will show a roughly parallel change, approaching a level
above or below .50 according to their direction and distance on the
stimulus dimension from the stimuli in the run.
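The hypothesis of continual re-averaging can be illustrated with a toy simulation. Everything specific in it is an assumption made for illustration: the article does not state an averaging rule, so the sketch uses an exponentially weighted running mean with a hypothetical `weight`, a logistic function with a hypothetical `slope` in place of the ogive, and made-up stimulus values.

```python
import math


def simulate(stimuli, al0, weight=0.1, slope=1.0):
    """Return the probability of a high judgment on each trial.

    stimuli : scale values presented in order
    al0     : assumed adaptation level (AL) before the first trial
    weight  : assumed share of each new stimulus in the running average
    slope   : assumed steepness of the judgment function around the AL
    """
    al = al0
    probs = []
    for s in stimuli:
        # Probability of "high" is .50 when the stimulus sits exactly at the AL.
        probs.append(1.0 / (1.0 + math.exp(-slope * (s - al))))
        # Re-average: move the AL a fraction of the way toward the stimulus.
        al += weight * (s - al)
    return probs


# A run of 32 low stimuli (value 5.0) judged by an S whose scale starts centered at 7.0.
low_run = simulate([5.0] * 32, al0=7.0)
print(round(low_run[0], 2), round(low_run[-1], 2))
```

Under these assumptions the probability for the stimuli in the run starts well below .50 and climbs toward it, which is the qualitative pattern the hypothesis predicts; the rate of approach depends entirely on the assumed weight.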

In one experiment, sequences were 
presented in which runs of 0, 4, 8, 
or 32 low stimuli were followed by a 
run of 64 high stimuli. In another 
experiment, 32 low and 64 high 
stimuli were presented in a sequence 
with alternating runs of 8 low and 
16 high stimuli, or in a sequence with 
alternating runs of 4 low and 8 high 
stimuli, or in a random sequence.


METHOD


The stimuli were brief descriptions of items of human behavior. The
items were fictional, but the Ss were told they were actual behaviors
of older men being screened for possible mental illness. The dimension
judged was unusualness, rareness, bizarreness of the behavior. Values
ranging from .5 to 12.5 had been obtained previously on this dimension
for 488 items by use of a method of equal-appearing intervals.³ From
this pool, 32 items of values 4.3 to 5.9 were taken as low stimuli and
64 items of values 7.3 to 9.4 were taken as high stimuli for the present
experiments.⁴ Each of these two classes of stimuli includes a large
enough segment of the dimension that it might be considered an
abbreviated distribution of stimuli of different values rather than a
set of equivalent stimuli. Under such an interpretation, “high stimulus”
would be defined as “stimulus drawn from the high distribution” and
“low stimulus” as “stimulus drawn from the low distribution.” This
complication is a cost of using stimuli of such a nature that they are
perceivably different even when of about equal value on the judged
dimension, which is probably a desirable feature for experiments with
sequences restricted to stimuli of only two values. The separation
between distributions was adequate for much greater average
discriminability of stimuli between distributions than within
distributions, but not adequate for perfect discriminability between
distributions.

³ Gratitude is due J. C. Gilchrist, Vera Kanareff, Barbara Geer, Lucy
Moeling, and Gershon Berkson, who made available the set of items which
they arduously prepared and scaled.

⁴ Examples of low stimuli: “One day he picked an empty table and began
to eat his dinner alone.” “He had a recurring dream in which he was
running up a hill.” “One day the doctor noticed that he frequently
rubbed his nose as he talked.” Examples of high stimuli: “He was
frightened once when left in the room by himself.” “He said to the
examining physician, ‘Don’t bother with me, I’m no good.’” “Once he sat
on the floor and refused to get up.”

The items chosen were typed on 3 X 5 in. 
white cards. For convenient identification in 
data tabulation, the cards were numbered 
randomly, and the Ss were instructed that 
the numbers had nothing to do with the judging. 
The cards were presented for judgment face 
down in a pile on a table before S, ordered from 
top to bottom in the sequence in which he was 
to judge them. (Random drawing of items 
from the two distributions to fill in stimulus 
sequences was done separately for each S.) 
A buffer set of eight blank cards was placed
on the bottom of the pile to prevent S from 
knowing which were his last judgments. 

The S judged each item by dropping it into 
one of two slots in a fiberboard box behind the 
pile of cards. The left-hand slot, representing 
low judgments, was labelled “less unusual, 
less rare, less bizarre”; the right-hand slot, 
representing high judgments, was labelled 
“more unusual, more rare, more bizarre.” The 
S could never see which or how many items he 
had placed in either category. 

The S was told when to read and judge 
each item by a tape recording which said, 
sequentially, “next... judge... next... 
judge... .” At “next,” S picked up and 
read the top card. At “judge,” he placed the 
card in a slot. There was a constant 7-sec. 
interval from “next” to “judge” and a constant 
3-sec. interval from “judge” to “next.” 

The Ss were 128 University of Wisconsin 
undergraduates, approximately half men and 
half women in each condition. Each sequence 
was judged by a group of 16 Ss except the 
sequence with runs of 8 low and 16 high stimuli, 
which was judged by a group of 32 Ss. 


RESULTS AND DISCUSSION


Most of the data obtained in these experiments are summarized in Fig.
1, 2, and 3, where empirical probabilities of a high judgment are
graphed as functions of trials. The term “trial” refers to the
presentation and judgment of one stimulus. Each data point in these
graphs is a mean probability of a high judgment for a group of Ss and a
block of trials.

[Figure 1: probability of a high judgment (y-axis) plotted against
trials (x-axis).]

Fig. 1. Probability of a high judgment for low stimuli during a run of
32 low stimuli (dashed curve) and for high stimuli during a run of 64
high stimuli (solid curves). The run of low stimuli was an initial run.
The run of high stimuli was an initial run for the 0L group and a second
run following 4, 8, or 32 low stimuli for the 4L, 8L, and 32L groups,
respectively. Each data point is a mean probability for a group of 16 Ss
and a block of 8 trials.

The sequence of stimuli for one 
group of Ss began with an initial 
run of 32 low stimuli. Their judg-
ments of this run are represented in
the dashed curve of Fig. 1. The 
probability of a high judgment for 
these stimuli is low for the early 
trials of the run, presumably because 
the Ss’ previous experience indicated 
these items were not very bizarre; 
their judgment scale was centered 
above these stimuli. However, the 
probability rises rapidly, reaching a 
level near .50 by the second block 
of eight trials, and remaining there; 
the Ss rapidly recentered their judg- 
ment scale on the low stimulus value. 
Analysis of variance showed an effect
of blocks of trials significant at the
.01 level in these data.

The 0L, 4L, 8L, and 32L curves of Fig. 1 represent judgments of a run of
64 high stimuli by a group for whom it was an initial run and by groups
for whom it was a second run following initial runs of 4, 8, or 32 low
stimuli.⁵ Judgments of this run included trials 1-64, 5-68, 9-72, and
33-96 for the various groups, but for simplicity they are labelled 1-64
for all groups in Fig. 1. The curve for the 0L group starts near .50
and remains there; evidently Ss centered their judgment scale on this
stimulus value almost immediately and kept it centered there. The 4L,
8L, and 32L curves have starting points progressively farther from
(higher than) .50, clearly reflecting varying stages of previous
recentering of the judgment scale on the low stimulus value. The more
low stimuli the Ss had judged, the lower was their AL, hence the farther
above AL were the first high stimuli. The obtained systematic ordering
of these curves would have a probability of less than .05 if all
possible orderings were equally likely. The 4L, 8L, and 32L curves all
approach .50 during the run of high stimuli, reflecting a new
recentering on the high stimulus value. Analysis of variance here showed
an effect of blocks of trials significant at the .001 level.

Figure 2 represents judgments for the group with alternating runs of
8 low and 16 high stimuli. Figure 3 represents judgments for the group
with alternating runs of 4 low and 8 high stimuli. During almost all the
runs of these sequences, the probability of a high judgment changes
toward .50 for the stimuli in the run. At the same time, in almost all
cases, there is a parallel change away from .50 in the probability of a
high judgment for stimuli of the other value, as revealed by comparison
of the probability for the last half of the preceding run with the
probability for the first half of the subsequent run. These results
reflect repeated incomplete recenterings of judgment scales for each
successive run of stimuli. The rate of change is greatest for the first
run or two of each sequence, but slow changes are evidenced throughout
the sequences.

Judgments of these stimuli in the random sequence did not show evidence
of changes in probabilities of a high judgment during runs, apparently
because most runs were very short (33 runs were only one trial long;
only four runs were more than three trials long), and the changes
consequently very slight. This is consistent with a conclusion that
recenterings of the judgment scale occur in the relatively slow manner
suggested by the gross representations of Fig. 1, 2, and 3.

⁵ The last group is the one whose first 32 judgments are summarized in
the dashed curve of Fig. 1. The groups given 4 and 8 low stimuli showed
mean probabilities of a high judgment of these stimuli of .23 and .32,
respectively.

The overall probability of a high judgment for the low stimuli during
the random sequence was .21; for high stimuli it was .68. For the
condition with runs of 8 low and 16 high stimuli, these probabilities
were .20 and .67. For the condition with runs of 4 low and 8 high
stimuli, these probabilities were .17 and .62. The probabilities seem
remarkably similar for the three sequences.

[Figure 2: probability of a high judgment (y-axis) plotted against
trials (x-axis).]

Fig. 2. Probability of a high judgment for low stimuli during runs of
8 low stimuli (dashed lines) and for high stimuli during runs of 16 high
stimuli (solid lines). Each data point is a mean probability for a group
of 32 Ss and a block of 4 trials on low stimuli or 8 trials on high
stimuli.

[Figure 3: probability of a high judgment (y-axis) plotted against
trials (x-axis).]

Fig. 3. Probability of a high judgment for low stimuli during runs of
4 low stimuli (dashed lines) and for high stimuli during runs of 8 high
stimuli (solid lines). Each data point is a mean probability for a group
of 16 Ss and a block of trials (2 trials on low stimuli).

Note that the probabilities for high stimuli tend to be about half as
far above .50 as the probabilities for low stimuli are below .50. This
indicates that the high stimuli are roughly half as far above AL as the
low stimuli are below AL, if the probability of a high judgment
approximates a linear function on the stimulus dimension in this region
around the AL. And since the high stimuli occur with twice the frequency
of the low stimuli in these sequences, the AL, if computed as an
arithmetic mean, should indeed be half as far below the high stimuli as
it is above the low stimuli.

Thus when judgments are pooled over the trials of a sequence and over
the group of Ss judging the sequence, the resulting average AL is about
what it should be according to AL theory for these three sequences,
despite the variations of AL within sequences which have been described.
In Fig. 2 and 3 the average probabilities for runs show considerable
stability from run to run of the same kind of stimuli, especially after
the first run or two of the sequence. Apparently the AL enters a region
where the changes during runs of one kind of stimuli balance or negate
the changes during runs of the other kind of stimuli, and apparently
this region is determined by the relative frequency (or relative length
of runs) of the two kinds of stimuli, along with their discriminability,
rather than by other properties of the sequences.
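The arithmetic-mean argument above can be verified with exact fractions. This is a sketch only: the function name is hypothetical, and the scale values 5 and 8 are illustrative stand-ins, not the values of the items actually used. Inputs are assumed to be integers or `Fraction` objects.

```python
from fractions import Fraction


def adaptation_level(low, high, n_low, n_high):
    """AL as the frequency-weighted arithmetic mean of two stimulus values."""
    return Fraction(n_low * low + n_high * high, n_low + n_high)


# Hypothetical scale values; 32 low and 64 high presentations as in the sequences.
low, high = 5, 8
al = adaptation_level(low, high, n_low=32, n_high=64)

above = high - al   # distance of the high stimuli above the AL
below = al - low    # distance of the low stimuli below the AL
print(al, above / below)
```

With high stimuli twice as frequent as low stimuli, the ratio of the two distances is 1/2 whatever values are chosen for `low` and `high`, since the AL then lies two-thirds of the way from the low value to the high value.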


SUMMARY 


Two-category judgments of sequences of 
stimuli of two values were studied with the 
method of single stimuli. The stimuli were 
brief descriptions of items of human behavior; 
the dimension on which Ss judged them was 
unusualness, rareness, bizarreness of the be- 
havior. High and low stimuli were drawn 
respectively from the higher and lower of two 
narrow, disjoint distributions of items obtained 
from a pool of items scaled on the dimension 
previously. High and low judgments were 
defined for S respectively as “more unusual, 
more rare, more bizarre” and “less unusual, less 
rare, less bizarre.” 

The general finding was that during a run 
of stimuli of the same value (either value) 
the empirical probability of a high judgment 
for these stimuli changed slowly toward .50 
from any prior level. This change was ac- 
companied by a parallel change away from .50 
in the probability of a high judgment for 
subsequent stimuli of the other value. The 
probabilities changed most rapidly during the
first run or two of a sequence, but continued to 
change with each succeeding run. These results 
were interpreted as reflecting continual re- 
centerings of the judgment scale on the stimulus 


dimension or continual changes in adaptation- 
level toward the value of the stimuli in a run. 

The average probabilities were similar for 
three sequences containing the same set of 
stimuli, despite marked differences in number 
and length of runs. These average probabilities 
were in accordance with adaptation-level theory. 


REFERENCES 


1. Guilford, J. P. Psychometric methods. (2nd Ed.) New York:
McGraw-Hill, 1954.

2. Helson, H. Adaptation-level as a basis for a quantitative theory of
frames of reference. Psychol. Rev., 1948, 55, 297-313.

3. Johnson, D. M. Learning function for a change in the scale of
judgment. J. exp. Psychol., 1949, 39, 851-860.

4. Johnson, D. M. The psychology of thought and judgment. New York:
Harper, 1955.

5. Parducci, A. Direction of shift in the judgment of single stimuli.
J. exp. Psychol., 1956, 51, 169-178.

6. Tresselt, M. E. The influence of amount of practice upon the
formation of a scale of judgment. J. exp. Psychol., 1947, 37, 251-260.

7. Tresselt, M. E., & Volkmann, J. The production of uniform opinion by
non-social stimulation. J. abnorm. soc. Psychol., 1942, 37, 234-243.

8. Volkmann, J. Scales of judgment and their implications for social
psychology. In J. H. Rohrer & M. Sherif (Eds.), Social psychology at the
crossroads. New York: Harper, 1951. Pp. 273-294.

9. Wever, E. G., & Zener, K. E. Method of absolute judgment in
psychophysics. Psychol. Rev., 1928, 35, 466-493.


(Received December 20, 1956) 





Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 


FACTORS IN INDIVIDUAL IMPROVEMENT IN 
SOLVING TWENTY-QUESTIONS PROBLEMS¹


WILLIAM L. FAUST 


Pomona College 


The game of “Twenty Questions” 
was employed by Taylor and Faust 
(4) in an earlier experiment concerned 
primarily with efficiency in problem 
solving as a function of size of group. 
The “Twenty Questions” type of 
problem was found to have much 
to recommend it for use as a research 
tool in studying problem solving. 
Solution of such problems depends 
upon asking a sequence of questions 
each of which can be answered by 
“Yes” or “No” and thereby succes- 
sively limiting the possible solutions 
until the correct solution is discovered. 
Since S verbalizes his questions,
these are available for objective 
analysis in both quantitative and 
qualitative study of problem solving. 

In the earlier study, Taylor and Faust (4) found that practice in
solving such problems resulted in improvement, evidenced by decreases in
time and number of questions required to solve the problem as well as in
number of failures. Since asking questions is fundamental to the process
of solution, improvement would be expected to show itself also in
changes in the kinds or patterns of questions asked.

¹ This report is based upon part of a dissertation submitted to the
faculty of the Department of Psychology of Stanford University in
partial fulfillment of the requirements for the Ph.D. Degree. The
research was carried out under Project NR 150-149 and supported by
contract Nonr 25125 between Stanford University and the Office of Naval
Research. Permission is granted to the United States Government for the
reproduction, translation, publication, use, and disposal of this
article in whole or in part. The work of the contract was under the
direction of Dr. Donald W. Taylor to whom the author is greatly
indebted. The author also wishes to thank Dr. Olga McNemar for her
constructive suggestions.

The present research was designed 
to identify some of the important 
changes in the kinds or patterns of 
questions which are related to im- 
provement in solving Twenty Ques- 
tions problems. Only a few pre- 
viously published studies (2, 3) have 
involved a qualitative analysis of 
the changes occurring in the problem 
solving process as individuals gain 
skill in solving a particular kind of 
problem. None has involved such 
an analysis with respect to the kind 
of problem represented by Twenty 
Questions. 


PROCEDURE 


From the 60 problem topics used in the 
Taylor-Faust study (4), those 20 failed least 
often by Ss working individually were chosen 
for use in the present study. Included were 
seven animal, six vegetable and seven mineral 
topics. This selection, which served to minimize 
failures, was desirable since failures produce 
curtailed distributions with attendant difficulties 
of statistical analysis. The arbitrary limit of 
30 questions used in the earlier study, instead 
of the traditional 20, was imposed here. 

Twenty students from the introductory class 
in psychology solved the same 20 problem 
topics. Five lists were developed in which the 
same 20 topics were arranged in counterbalanced 
order so that improvement would not be con- 
founded with difficulty. Each list was organized 
into five sets of four topics. Solving one set 
of four problems constituted the task presented 
to S on each of five successive days. The lists 
were so constructed that no two topics occurred 
together within a set of four in more than one 
list; and no topic occurred on the same day in 
two different lists. The 20 Ss were assigned so 
that each of the five lists was undertaken by 
four Ss.²


² For a more complete description of the
procedure and also of the results, see Faust (1).
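The two constraints on the five lists (no pair of topics together in a set in more than one list, and no topic on the same day in two lists) can be met by a simple modular construction. The sketch below is one possible construction, offered as an assumption: the article does not say how its lists were built, and the topic numbering 0 to 19 and both function names are hypothetical.

```python
from itertools import combinations

N_LISTS, N_DAYS, SET_SIZE = 5, 5, 4  # 5 lists, 5 days, 4 topics per day


def build_lists():
    """Arrange topics 0..19 as a 4 x 5 grid (row r, column c) and, in list k,
    place topic (r, c) on day (c - k * (r + 1)) mod 5."""
    lists = []
    for k in range(N_LISTS):
        days = [[] for _ in range(N_DAYS)]
        for r in range(SET_SIZE):
            for c in range(N_DAYS):
                topic = r * N_DAYS + c
                days[(c - k * (r + 1)) % N_DAYS].append(topic)
        lists.append(days)
    return lists


def satisfies_constraints(lists):
    """Check both counterbalancing constraints stated in the text."""
    seen_pairs = set()
    days_used = {}
    for days in lists:
        for d, topics in enumerate(days):
            for pair in combinations(sorted(topics), 2):
                if pair in seen_pairs:      # same two topics together twice
                    return False
                seen_pairs.add(pair)
            for t in topics:
                if d in days_used.setdefault(t, set()):  # topic repeats a day
                    return False
                days_used[t].add(d)
    return True


lists = build_lists()
print(satisfies_constraints(lists))
```

The construction works because 5 is prime: for a fixed row the day shift differs for every list, and two topics from different rows line up on the same day in exactly one list. Assigning four Ss to each of the five lists then gives the 20 Ss described.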










The instructions, the procedure of answering
questions, and the method of scoring individual
performance were identical with those employed
in the earlier study (4).


RESULTS 


That Ss did actually improve in 
performance over five days is shown 
by a decrease in the mean number 
of questions from 19.6 for Day 1 to 
13.7 for Day 5, and a decrease in 
the mean time required per problem 
from 321 sec. on Day 1 to 179 sec. 
on Day 5. Both of these measures 
of improvement reveal differences 
between Day 1 and Day 5 significant 
at the .O1 level when tested by ¢ for 
correlated series. It should be noted 
that time to solve a problem is only 
partly dependent upon the number 
of questions required. Also quite
important is the amount of time
between questions. This over-all
improvement is very similar to that
shown by Ss working individually
in the Taylor-Faust experiment (4).
However, the use of simpler topics
in the present study is reflected in
the lower mean number of questions
required each day.


Recurrent Questions 


In analyzing the nature of improvement
occurring with practice, it was
necessary to develop methods for
classifying the kinds of questions
asked. One method of classification
dichotomized questions as either
recurrent or non-recurrent. A question
was classified as recurrent when,
for a given topic, it was asked by 8
or more of the 20 Ss. It was necessary,
of course, to classify as the same
question a variety of phrasings
involving essentially the same concept.
The number of recurrent questions
varied from 6 to 10 among the six
animal topics, from 5 to 6 among the
seven vegetable topics, and from 3
to 8 among the seven mineral topics.

Definitive character of recurrent ques- 
tions.—An interesting and important 
issue is whether the recurrent ques- 
tions for a particular topic, when 
considered together, provide sufficient 
information to define logically or 
specify that topic. The more im- 
portant point for the present purposes 
is not whether the information is 
logically sufficient, but whether it 
is pragmatically sufficient, i.e., 
whether Ss given only that informa- 
tion can identify the goal object. 


To investigate this point, a supplementary
experiment was conducted which employed
as Ss 12 students from the introductory class
in psychology at Pomona College who had no
previous contact with research on Twenty
Questions. These Ss were told that they were
to play a modified game of Twenty Questions.
It was explained that instead of asking their
own questions, they would have read to them
a series of questions which other Ss had asked,
together with the answer to each question in
turn. They were further instructed that after
hearing each series they were to “decide on the
basis of the information given in the questions
and answers what the correct solution was.”
They were told to write down in order the “five
most probable” answers.
The results obtained give clear 


evidence of the definitive character 
of recurrent questions. In one case, 
the correct answer, orange, was the 
first choice of all 12 Ss. For 13 of 
the 20 topics the correct answer was 
named by at least 10 of the 12 Ss 
as one of their five choices. For 
only three of the 20 topics (all 
mineral topics) did less than one of
the 12 Ss fail to list the correct answer.

Recurrent questions and improve- 
ment.—Since recurrent questions ap- 
pear to be fairly definitive, it might 
be expected that the improvement 
shown from Day 1 to Day 5 would 
be related to the number of these 
recurrent questions asked. To test
this possibility, analyses were con- 
ducted comparing the number of 







recurrent questions asked by each
S on the first and last day. An
adjustment was necessary because,
as reported above, the number of
questions classified as recurrent, and
hence the possible number which
could be asked by S, varied from topic
to topic. The adjusted measure, N,
employed in making the necessary
comparisons is simply the number
of recurrent questions asked by a
given S divided by the possible
number of recurrent questions for
that topic. Since the total number
of questions decreased from Day 1
to Day 5, it would be possible that,
although the number of recurrent
questions might not increase, the
percentage of such questions might
increase. Therefore, analogously, P,
employed in comparing percentages,
is the percentage of recurrent questions
asked by a given S divided by
the possible number of recurrent
questions for that topic.

Comparison of N scores for the
first animal, vegetable, and mineral
topic on Day 1 with the first topic
of the same class on Day 5 provides
no evidence of an increase with
practice in number of recurrent
questions asked.
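The two adjusted measures can be written out as a brief sketch. The function and argument names are illustrative, and the counts in the example are hypothetical values chosen only to show how N can stay constant while P rises when fewer questions are asked in total.

```python
def adjusted_n(recurrent_asked, possible_recurrent):
    """N: recurrent questions asked, divided by the possible number for the topic."""
    return recurrent_asked / possible_recurrent


def adjusted_p(recurrent_asked, total_asked, possible_recurrent):
    """P: percentage of questions that were recurrent, divided by the possible number."""
    percentage = 100.0 * recurrent_asked / total_asked
    return percentage / possible_recurrent


# Hypothetical S: 5 recurrent questions on a topic having 8 recurrent questions,
# out of 20 questions on Day 1 and out of 12 questions on Day 5.
n_day1 = adjusted_n(5, 8)
n_day5 = adjusted_n(5, 8)
p_day1 = adjusted_p(5, 20, 8)
p_day5 = adjusted_p(5, 12, 8)
```

In this made-up case N is unchanged from Day 1 to Day 5, whereas P increases, which is the pattern the adjusted measures were designed to separate.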

Comparison of P scores on Day 5
with those on Day 1 provides clear
evidence of an increase with practice
in the percentage of recurrent questions
asked. For the 20 Ss, higher
P scores on Day 5 than on Day 1 were
obtained by 15 for animal, by 17
for vegetable, and by 14 for mineral
topics. Computation using direct
probability shows that the difference
between Day 5 and Day 1 is significant
at the .03 level for animal
and at the .002 level for vegetable
topics, but just fails to reach
significance for mineral topics. This
failure may possibly be accounted for
by the fact previously noted that
recurrent questions were less
definitive for mineral than for other
topics.
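A direct-probability computation of this kind can be sketched as an exact binomial (sign) test. This is an assumption about the method: the report does not state how ties were handled or whether one or two tails were used, so the values produced here need not equal the reported levels.

```python
from math import comb


def upper_tail_probability(successes, n):
    """Probability of `successes` or more out of n when each outcome has p = 1/2."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


# Number of the 20 Ss with a higher P score on Day 5 than on Day 1.
higher_on_day5 = {"animal": 15, "vegetable": 17, "mineral": 14}
one_tailed = {topic: upper_tail_probability(count, 20)
              for topic, count in higher_on_day5.items()}
```

Under these assumptions the one-tailed probability for 14 of 20 is about .058, which agrees with the statement that the mineral comparison just fails to reach significance.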

In view of the finding that the
percentage of recurrent questions is
positively related to improvement,
it seemed reasonable to expect that,
as practice progressed, such questions
would tend to be asked earlier
in the questioning. However,
supplementary analysis failed to confirm
this expectation.

Recurrent questions in relation to
success and failure.—The analysis
of recurrent questions for topics
solved and topics failed was different
from those already described since
varying difficulty of problems and
unequal amounts of practice preceding
a given problem for different
Ss would have complicated the pairing
of solved and failed topics. For each
S, the total number of recurrent
questions asked, regardless of class,
for all topics solved and similarly
the total for all topics failed, was
determined.

Since various Ss solved and failed 
different problems, and since the 
possible number of recurrent questions 
which could be asked differed from 
problem to problem, it was necessary 
to use the adjusted measures, N and 
P as in the preceding analyses. When 
this was done, it was found that for 
only 12 of the 20 Ss was N greater 
for the problems solved than for those 
failed. There was no significant re- 
lation between number of recurrent 
questions asked and success or failure. 
However, analysis of the percentage 
(P) of recurrent questions showed 
that P was greater for problems solved 
than for problems failed for all 20 
Ss. 


Rated Quality of Questions 


TABLE 1

Difference Day 1 Minus Day 5 in Mean Number of Questions Rated “1” and “2”

                 Rated “2”              Rated “1”         Difference between Differences
             Day 1 minus            Day 1 minus            “2” minus “1”
Category        Day 5        P         Day 5        P         Ratings         P
Animal           4.9        .05          .4        N.S.         4.5          .02
Vegetable        5.8        .02         1.2        N.S.         4.6          .02
Mineral          3.9        N.S.       −1.1        N.S.         5.0          .05


If attention had been focused
solely on recurrent questions, many
other questions which aided Ss in 
arriving at solutions, but which were 
asked by less than eight Ss, would 
have been ignored. To study these
informative questions another type 
of analysis was undertaken. 

As a first step, criteria were decided
upon to distinguish questions that
elicited useful information from those
which did not. A rating of “1”
was given to questions which sought
to dichotomize the remaining alternatives
or to make a relatively
symmetrical distinction appropriate
to the stage of development at which
the question was asked.3 All other
questions were given a rating of
“2.”

The reliability of the ratings be- 
tween the judgments of two raters on 
a random sample of 20 protocols was 
satisfactory, as shown by a tetra- 
choric correlation of .80. 

The number of “1” questions 
which could be asked was not limited, 
as was the number of recurrent ques- 
tions in the previous analyses. Hence, 
it was possible to compare directly 
the number of “1” questions asked 
on the first topic of the class animal, 
or vegetable, or mineral on Day 1 
with the number asked on the first 
topic of the same class on Day 5. 
Improvement with practice brought
a significant reduction in the number
of “2” questions for both animal and
vegetable topics, but not a significant
increase in the number of “1” questions
for any class of topics. (See
Table 1.)

3 For detailed description of the ratings see
Faust (1).

Since the total number of questions
asked decreased from Day 1 to
Day 5, perhaps a more pertinent
question is whether there was
significant change in the number of “1”
questions relative to the number of
“2” questions over that period. When
the difference between the differences
was tested, the change in relative
number of “1” and “2” questions
was found to be significant for all
three classes of topics, as is also
shown in Table 1.
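The "difference between the differences" is simple arithmetic on the Table 1 entries, as the sketch below illustrates. The values are read from Table 1 (taking the animal entry for questions rated “1” as .4); the dictionary layout and function name are illustrative only.

```python
# (Day 1 minus Day 5 for questions rated "2", Day 1 minus Day 5 for questions rated "1")
table_1 = {
    "Animal": (4.9, 0.4),
    "Vegetable": (5.8, 1.2),
    "Mineral": (3.9, -1.1),
}


def difference_between_differences(drop_rated_2, drop_rated_1):
    """Decrease in "2" questions minus decrease in "1" questions, to one decimal."""
    return round(drop_rated_2 - drop_rated_1, 1)


relative_change = {category: difference_between_differences(*drops)
                   for category, drops in table_1.items()}
```

The results reproduce the last data column of Table 1, which supports the reading of the tabled entries used here.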


Percentage of “Yes” Answers 


The results of another method
of analyzing the questions will be
reported briefly. Suppose that S
is given the task of identifying a
number selected at random by E
from within known finite limits and
is allowed to ask only questions which
can be answered “Yes” or “No.”
The most efficient procedure would
be to ask a series of questions, each
of which successively divided the
remaining alternatives in half. Since
S would have no way of knowing
which half of the remaining alternatives
was correct, on the average
his guesses should lead to “Yes”
answers 50% of the time. It should
be noted that a “Yes” or a “No”










answer yields equal information in
the above case. It would be possible
to solve such a problem (or a Twenty
Questions problem) asking questions
receiving a preponderance of “No”
answers. In an extreme case, every
question might be answered “No”
except of course the last question
naming the correct answer. However,
in the game of Twenty Questions,
the range of remaining alternatives
is rarely known precisely. Probably
few of the questions divide the
remaining range precisely in half.
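The number-identification case described above can be sketched as a halving procedure. The question form ("Is the number no greater than m?"), the limits 1 to 1024, and the function name are illustrative assumptions, not part of the experiment.

```python
def identify(target, low, high):
    """Locate target within [low, high] by yes/no questions that halve the range."""
    questions = 0
    yes_answers = 0
    while low < high:
        mid = (low + high) // 2
        questions += 1
        if target <= mid:        # E answers "Yes"
            yes_answers += 1
            high = mid
        else:                    # E answers "No"
            low = mid + 1
    return low, questions, yes_answers


results = [identify(t, 1, 1024) for t in range(1, 1025)]
total_questions = sum(q for _, q, _ in results)
total_yes = sum(y for _, _, y in results)
yes_rate = total_yes / total_questions
```

With 1024 equally likely alternatives every target is found in ten questions, and over all possible targets exactly half of the answers are “Yes.” In Twenty Questions proper the alternatives are not enumerated, so this is the idealized limiting case only.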

The percentage of “Yes” answers 4
on both topics solved and topics
failed was always less than 50.
However, there was no trend nor
significant change in the percentage
of “Yes” answers over the five days.
If those questions which divide the
remaining alternatives in half are
considered the most efficient, then
improvement could not have been a
result of asking questions which were
increasingly more “efficient.”

An interesting fact appears when
for each day the problems which were
solved are compared with those which
were failed; the former show a
percentage of “Yes” answers between
41.1 and 49.9, whereas the latter show
a percentage of “Yes” answers between
27.2 and 33.9. This finding,
of course, is consistent with the
expectation that more efficient questions
would tend to divide the range
of alternatives in half and that the
percentage of “Yes” answers to such
questions would approximate 50.


4 In this analysis, answers in the “Partly,”
“Sometimes,” and “Not in the usual sense of 
the word” classifications were divided equally 
between the “Yes” and the “No” groups; 
actually the percentage of such answers was so 
small that this had no appreciable effect upon 
the results. In computing the percentage of 
“Yes” responses the final “Yes” answer to the 
question which solved the problem was omitted. 


Discussion 


For the Twenty Questions type of 
problem, information theory would predict
that in the long run the most
efficient performance would result when
each question divided the range of 
remaining alternatives in half, and 
hence when a “Yes” or “No” answer 
would be equally likely. The percentage 
of “Yes” answers for problems solved 
was closer to 50 than that for problems 
failed; however, the percentage of “Yes” 
answers did not vary with improvement 
in performance over five days of practice. 
Although no significant change in the 
percentage of ‘Yes’ answers occurred, 
the results of the analysis of recurrent 
questions and of the rated quality of 
questions agree in showing a significant 
decrease in the number of “poor” 
questions. This must mean that the 
kinds of questions dropped were not 
exclusively those receiving a “No” 
answer. It would seem that the criteria 
for dropping “poor” questions are not
to any great extent dependent upon the 
answers received. Probably the criteria 
for selecting or rejecting questions are 
primarily dependent upon the usefulness 
of the information expected. 

Equally, the finding that the number 
of “good” questions did not increase 
with practice is interesting since much 
improvement is thought of as involving 
the evolution and improvement of an 
aspect of performance which was not 
available in the repertory of S before 
practice. Yet, there are many kinds 
of improvement in which this is not 
the case. In typical maze learning, for 
example, if S reaches the goal on the 
first trial, he must traverse the correct 
path, but he tries many blind paths 
in the process of finding this correct 
path. Improvement consists not in 
finding new correct paths, but in dis- 
tinguishing the correct paths which 
were traversed on the first trial from 
the blind paths. The correct paths as 
S improves make up an increasingly 
larger percentage of the total paths 
entered. Similarly, in Twenty Ques- 
tions, the “good” questions comprise 










a larger percentage of the total questions 
asked on Day 5 than on Day 1. 

Yet, the process of solution of Twenty 
Questions problems differs markedly 
from the process of solution of the maze. 
In the maze S must traverse the elements 
of the correct path in order to reach the 
solution. In Twenty Questions, on the 
other hand, any one of the “good” 
questions can be omitted and the problem 
still solved. In a few cases Ss solved 
the problem without asking any of the 
recurrent questions for that topic; in 
some cases two solutions for the 
same topic involved entirely different 
questions. 

Improvement in Twenty Questions 
does not consist in learning to make 
specific responses (ask certain questions). 
Profit in learning such specific responses 
is precluded since the same topic is 
never encountered twice. At the beginning
of practice Ss ask both “good”
and “poor” questions. As Ss improve 
with concentrated practice they do not 
ask more of the “good” questions. 
Rather, improvement consists in elim- 
inating many of those kinds of questions 
which do not lead to profitable
information.


SUMMARY 


The problem investigated was: What changes
in process lead to improvement with practice
in solving problems of the type involved in the
game of “Twenty Questions”? Twenty Ss
solved four such problem topics on each of five
consecutive days. Improvement was demonstrated
by an average decrease from Day 1 to
Day 5 of approximately five questions and
about two minutes per topic. Three methods
of analyzing changes in the questioning process
yielded the following results:

1. Recurrent questions, i.e., those asked by
eight or more Ss, were found in a supplementary
analysis to be definitive for 17 of the 20 topics.
Although there was no increase in number of
recurrent questions, the percentage of such
questions did increase. The number of recurrent
questions for solved topics did not differ from
that for failed topics.

2. There was no increase with practice in
the number of questions rated “good,” but
there was a decrease in the number rated
“poor” for both animal and vegetable topics.
The increase from Day 1 to Day 5 in number
of “good” relative to number of “poor” questions
asked was significant for all three classes
of topics.

3. The percentage of questions permitting
“Yes” answers did not vary with improvement,
but a larger percentage of “Yes” answers was
found for topics solved than for topics failed.


REFERENCES 


1. Faust, W. L. Determinants of individual
improvement and of group performance
in solving certain types of verbal and
spatial problems. Unpublished doctor's
dissertation, Stanford Univer., 1954.

2. Ruger, H. A. The psychology of efficiency.
Arch. Psychol., 1910, No. 15.

3. Shaw, M. E. A comparison of individuals
and small groups in the rational solution
of complex problems. Amer. J. Psychol.,
1932, 54, 419-504.

4. Taylor, D. W., & Faust, W. L. Twenty
questions: Efficiency in problem solving
as a function of size of group. J. exp.
Psychol., 1952, 44, 360-368.


(Received December 26, 1956) 





Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 


EFFECT OF BRIGHTNESS OF SIMULTANEOUS 
VISUAL STIMULATION ON ABSOLUTE 
AUDITORY SENSITIVITY 1


RICHARD F. THOMPSON, JAMES F. VOSS,2 AND W. J. BROGDEN


University of Wisconsin 


The experimental findings to be
reported represent an attempt to
extend and clarify the results obtained
by Gregg and Brogden (3) in
their study of intersensory facilitation
and inhibition. This latter study
used three light intensity conditions:
a zero light condition with the fixation
circle consistently at a brightness
level of 2.050 ml.; an increase in
brightness of .015 ml.; and an increase
in brightness of .055 ml. The
increments in brightness of the light
are values of the variable stimulus
for 38% and 100% response on the
difference-threshold function for which
the zero brightness level of 2.050
ml. was the standard stimulus. Absolute
thresholds of auditory acuity
were obtained for each of the three
light conditions with the light
accompanying each presentation of the
tone of 1000 cps. The Ss who were
instructed only to fixate the light
source showed a statistically significant
increase in auditory acuity with
each increase in brightness of the
light; Ss instructed to respond to
both the tone and light showed a
statistically significant decrement in
acoustic sensitivity for the two
light-increment conditions. There was no
reliable difference in auditory acuity
between the two light-increment
conditions for the latter group.

In the present study eight light
conditions were employed. One was
the zero differential light intensity
and the other seven were increments
in brightness of the light representing
magnitudes defined by the difference
threshold function. Absolute thresholds
to a tone of 1000 cps were obtained
for each of the eight light
conditions on two groups of Ss,
differentiated on the basis of the two
sets of instructions employed by
Gregg and Brogden (3). In addition,
auditory thresholds equal in number
to those for the experimental groups
were obtained on Ss of a control
group in order to check on practice
effects. The experimental design thus
provides for examination of the
facilitative and inhibitory functions of
auditory acuity and simultaneous
light stimulation over an extensive
range of light intensity.

1 Supported in part by the Research Committee
of the Graduate School from funds
granted by the Wisconsin Alumni Research
Foundation.

2 Now at Wisconsin State College, Eau Claire.


APPARATUS AND PROCEDURE 


The apparatus was essentially that of Gregg
and Brogden (3) with certain modifications.
In brief, S sat in a sound-treated room facing a
gray panel containing an illuminated circle of
milk glass subtending a visual angle of approximately
43'. Under the zero light condition
the brightness of the visual stimulus matched
that of the surrounding panel.

Zero light intensity was produced by a 100-w.
incandescent bulb and fixed filters. The light
intensity increments were obtained from the
same source, using fixed filters and a wedge
and balancing wedge. The increment light
intensities were cast on the milk glass screen
by a solenoid-driven movable mirror, thus
producing constant increments in the light
intensity of the fixation circle viewed by S.
Because of changes in general conditions from
those used by Gregg and Brogden (3), a new
normative group of 30 Ss was used to obtain a
difference threshold function to serve as a basis
for determining the light conditions for the







TABLE 1 
Values of Light Stimuli in Terms of
Brightness, ΔI/I and Per Cent
Frequency of Response on the
Difference Threshold Function

Light        Brightness               % Frequency
Condition    in ml.         ΔI/I      of Response

1            2.090          .000          0
2            2.105          .007          4
3            2.121          .015         25
4            2.137          .022         50
5            2.152          .030         75
6            2.215          .060        100
7            2.340          .120        100
8            2.591          .240        100

experiment. The brightness of the fixation
circle, under the zero light increment condition,
was 2.090 ml., as measured by a Macbeth
illuminometer. The values for all light conditions
are given in Table 1 in terms of brightness,
in terms of ΔI/I, and in terms of the percentage
frequency of response on the difference threshold
function. The increments in brightness and in
terms of ΔI/I represent a geometric series
except for Cond. 4, which represents a half-step.
This light increment is the threshold value and
was included in order to provide representation
of the threshold light intensity among the values
of the independent variable. The auditory
stimulus, maintained at 1000 cps throughout
the experiment, was presented to S via a calibrated
Wien bridge oscillator, attenuators, a
matching transformer, and two loudspeakers
mounted beneath S's table.
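The geometric character of the series can be checked from the brightness values in Table 1, as in the following sketch. The variable names are illustrative, and the tolerance used to judge "approximately doubling" is an assumption, since the tabled values are rounded to three decimals.

```python
# Brightness in ml. for Light Conditions 1-8, as listed in Table 1.
brightness = [2.090, 2.105, 2.121, 2.137, 2.152, 2.215, 2.340, 2.591]

# Increment of each condition over the zero condition (Cond. 1).
increments = [round(b - brightness[0], 3) for b in brightness[1:]]

# Omit Cond. 4 (the half-step at threshold) before taking successive ratios.
series = [inc for cond, inc in zip(range(2, 9), increments) if cond != 4]
ratios = [later / earlier for earlier, later in zip(series, series[1:])]

# Weber fractions computed from the brightness values.
delta_i_over_i = [round(inc / brightness[0], 3) for inc in increments]
```

Each successive increment is roughly double the preceding one once Cond. 4 is set aside, and the computed ΔI/I values agree with the tabled column.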

All Ss of the two experimental groups and 
one control group were given the same instruc- 
tions as those given verbatim in the article of 
Gregg and Brogden (3). The Ss of experi- 
mental Group I were given the additional 
instructions about responding to the light that 
Gregg and Brogden used for Ss of their Group I. 
Five practice tones, approximately 18 db above
threshold, were given before actual thresholds
were taken. In determining the auditory
thresholds, 2-db steps between stimulus in- 
tensities were used and the threshold intensity 
of the tone for 50% response was recorded in 
decibels of attenuation. A warning buzzer 
with a duration of 300 msec. preceded the tone 
by 1000 msec., and the tone (alone or with 
light) was presented for 2000 msec. All auditory 
thresholds were obtained by the modified form 
of the method of limits employed by Gregg and 
Brogden (3). Twenty-four threshold deter- 
minations were made for each S. Thus, there 


were three thresholds taken for each of the 
eight light conditions for each S of the two 
experimental groups. 

A total of 48 Ss, volunteers from the courses
in elementary psychology at the University of
Wisconsin, served in the experiment with 16
assigned to each group. Two Es ran eight of the
Ss in each of the three groups. The problem
of sequence of light condition is critical in
the treatment of the experimental groups.
Normally this would be solved by the use of an
8 × 8 Latin square. Such a design in the
present experiment, however, would not make
it possible to maintain the differentiation
between the two experimental groups provided
by the two sets of instructions. Since four of
the eight light intensities were above the difference
threshold, an S in the group instructed
only to fixate on the light (Group II), if presented
with an above-threshold light intensity
early in the sequence, might make an active
attempt thereafter to discriminate light intensities
as well as tonal intensities. Such an
S would be responding in essentially the same
manner as the Ss instructed to respond to both
the tone and the light (Group I). Consequently,
the zero light intensity and the three intensities
below the threshold were used in one 4 × 4
Latin square and the four intensities above
the threshold were used in a second 4 × 4
Latin square. The Ss were assigned at random
to a row of the first square and after completing
it, completed the same row of the second square.
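The arrangement of two 4 × 4 Latin squares side by side can be sketched as follows. The particular squares used in the experiment are not given, so the cyclic squares below are an assumption made purely for illustration, as are the function names; Light Conditions 1-4 are taken as the first square and 5-8 as the second.

```python
def cyclic_latin_square(symbols):
    """Build an n x n Latin square by cyclically shifting the symbol list."""
    n = len(symbols)
    return [[symbols[(row + col) % n] for col in range(n)] for row in range(n)]


def is_latin(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


first_square = cyclic_latin_square([1, 2, 3, 4])    # zero and lower intensities
second_square = cyclic_latin_square([5, 6, 7, 8])   # intensities above threshold

# Each S completes a row of the first square, then the same row of the second.
latin_rectangle = [a + b for a, b in zip(first_square, second_square)]
```

Every row of the resulting rectangle presents all eight conditions, with no above-threshold intensity occurring before the four lower conditions have been completed, which is the property the design was meant to secure.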


RESULTS 


The threshold data were adjusted
to zero sensation level for the zero
light-increment condition by the same
method used by Gregg and Brogden
(3). The median threshold for the
zero light condition was taken as the
zero sensation level for that S, i.e.,
his normal threshold of audibility.
The median threshold for each of the
other seven light conditions was
subtracted from the median for the
zero light condition for each S to
transform the raw data in terms of
sensation level. The results of an
analysis of variance performed on
these data for the three groups of Ss
are given in Table 2. This analysis
is a modification of standard Latin
square analysis dictated by our
experimental design of a Latin rectangle
(two 4 × 4 Latin squares side by
side). Since the hypothesis of
homogeneity of variance was not confirmed,
it was decided to use the 1% level
of significance in lieu of the 5% level
in evaluating the results of the
analysis of variance and all subsequent
statistical tests of these data.
None of the fractional sources of the
between-Ss variation are significant.
Since the F ratio of the Residual-
between-cells to the Residual-within-
cells is significant, the Residual-
between-cells mean square was used as
the denominator in all F tests for
the fractional sources of the within-Ss
variation. Light Conditions, the
interaction of Light Conditions and
Experimental Conditions (instructions
and treatment for the two experimental
and one control group), and
Design Uniqueness are statistically
significant sources of variation. The


TABLE 2 


ANALYSIS OF VARIANCE OF AUDITORY
THRESHOLDS (ALL GROUPS)


Source of Variation                  df        MS         F

Total between Ss                     47
  Experimental Conditions (EC)        2      4,830       .79
  Sequence (S)                        3     17,284      2.81
  Experimenter (E)                    1     28,970      4.71
  EC × S                              6      4,962       .80
  EC × E                              2      4,521       .74
  S × E                               3      5,746       .93
  EC × S × E                          6      1,798       .29
  Ss within Sequences                24      6,148

Total within Ss                    1104
  Light Conditions (LC)               7      8,203     12.19*
  LC × EC                            14      1,597      2.37*
  LC × E                              7        851      1.26
  LC × EC × E                        14        812      1.21
  Design Uniqueness (DU)             14      2,211      3.29*
  DU × EC                            28      1,166      1.73
  DU × E                             14      1,212      1.80
  DU × EC × E                        28      1,010      1.50
  Residual between Cells            210        673      5.18*
  Residual within Cells             768        130


* P = .01
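The F ratios for the within-Ss sources can be recomputed from the mean squares in Table 2, using the Residual-between-cells mean square as denominator as described in the text. The sketch below does only that arithmetic; the dictionary keys are shorthand labels, and the mean squares are read from the table as printed.

```python
RESIDUAL_BETWEEN_CELLS = 673
RESIDUAL_WITHIN_CELLS = 130

# Mean squares for the fractional sources of the within-Ss variation.
within_ss_ms = {
    "LC": 8203,
    "LC x EC": 1597,
    "LC x E": 851,
    "LC x EC x E": 812,
    "DU": 2211,
    "DU x EC": 1166,
    "DU x E": 1212,
    "DU x EC x E": 1010,
}

f_ratios = {source: round(ms / RESIDUAL_BETWEEN_CELLS, 2)
            for source, ms in within_ss_ms.items()}

# Test of the between-cells residual against the within-cells residual.
f_residual = round(RESIDUAL_BETWEEN_CELLS / RESIDUAL_WITHIN_CELLS, 2)
```

The recomputed ratios agree with the F column of Table 2 for these sources, and the degrees of freedom listed for the within-Ss sources sum to the tabled total of 1104.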
















Fig. 1. The functional relation of auditory
acuity and intensity of simultaneous light with
instructions as the parameter.


significance of the Light Conditions 
variable indicates that auditory acuity 
varies with change in the light 
intensity accompanying presentations 
of the tone. The significant inter- 
action of Light Conditions and Ex- 
perimental Conditions further shows 
that auditory sensitivity is differen- 
tially affected by light intensity as a 
function of the instructions and 
treatment of the Ss in the three 
groups. The significance of Design 
Uniqueness suggests that the particu- 
lar sequences of light conditions are 
related in some way to the other 
experimental variables. 

The first step in further analysis 
of the data was a check on the 
possibility of a practice or other 
temporal effect on the thresholds. 
An Alexander trend test (1) was 
performed on the data of the control 
group. There was no significant group
slope and no significant group devia- 
tion from linearity. Since these data 
are best represented by a straight line 
of zero slope, there is no evidence of 
a practice or other temporal effect. 

The results for the control group 
make it possible to consider the data 
of the two experimental groups solely 
in terms of the light conditions and 
the instructions. Figure 1 presents 













a plot of the mean deviation of audi- 
tory thresholds from the sensation 
level provided by the threshold at the 
zero light intensity as a function of 
increment in light intensity. It is 
possible to consider these data from 
at least two points of view. In the 
one case, the data for each group are 
considered to represent continuous 
functions of change in auditory acuity 
over the full range of change in the 
brightness of the light. Evaluation 
from this point of view requires the 
fitting of curves to each set of data 
and determining the differences be- 
tween the fitted functions. In the 
other case, the data can be considered 
to demonstrate two separate effects, 
an inhibitory effect for Group I at 
the two sub-threshold light increments 
and a facilitative effect for both 
groups at the four supra-threshold 
light intensities. In the latter case, 
evaluation of the data is in terms of 
differences in auditory acuity between
Groups I and II for the sub-threshold
light intensities, between the zero 
level of auditory acuity for Group II 
at the sub-threshold light intensities 
and the level of auditory acuity for 
both Groups I and II, and between 
Groups I and II for the four supra- 
threshold light intensities. 

A trend analysis (equal steps provided
by omission of data for increment
of .047 ml.) with orthogonal
polynomials (2) showed no significant
differences between group trends,
but did show significant differences
between individual means, and
significant overall trend with significant
linear and quintic components.
Separate analyses of the two sets of data
showed significant differences between
individual means and significant overall
trend with significant linear and
quintic components for the data of
Group I and with a significant linear
component only for the data of Group
II. These results do not support any
definite conclusions about the form
of the two functions or a difference
between them. The difference obtained
by the separate analyses is
probably due to the apparent inhibitory
effect of the sub-threshold light
increments on auditory acuity for the
Ss of Group I. This difference also
makes it impossible to treat both
functions in terms of the overall
trend (significant linear and quintic
components) obtained in the
combined analysis.

Evaluation of the hypothesized
inhibitory effect for Group I at
sub-threshold light intensities and the
hypothesized facilitating effect for
both groups at the supra-threshold
light intensities was accomplished by
means of t tests. The between-
individual trends variance was used to
compute an estimate of σ_mean diff.
The results of one-tailed t tests show
the combined mean (.66 db) for
Group I at the light increments of
.0157 ml. and .0313 ml. to be
significantly greater, at the 1% level,
than the comparable mean for Group
II (−.16 db). The combined mean
for Group I is significantly greater
than zero whereas that for Group II
is not. The means of −.25 db for
Group I and −.03 db for Group II
at the light increment of .047 ml.
do not differ significantly from each
other or from the zero level. The
combined mean for the four supra-
threshold light increments (.063
to .402 ml.) for Group I of −2.06 db
is significantly (P = .01) larger (more
negative) than the comparable combined
mean of −1.33 db for Group II.
Both of these combined means are
significantly different from zero.
Discussion 
The analyses of the results provide 


greater support for the second than 













for the first of the views of the data 
presented above. Thus, it may be 
concluded that with information about 
the light prior to its presentation and 
with instructions to respond to it (Group 
1), there is an inhibition of auditory 
acuity at sub-threshold light increments 
whereas no information and no instruc- 
tions (Group II) result in no effect of 
sub-threshold light intensities upon au- 
ditory acuity. At the threshold value 
of the light, there is no effect of either 
type of instructions upon auditory 
acuity. Supra-threshold light incre-
ments clearly provide for a facilitation
of auditory acuity regardless of instruc- 
tions, with a greater facilitation for the 
Group II instructions than those for 
Group I. This latter effect is probably
due to the requirement of Ss in Group I 
that they report on the occurrence of the 
light after they have reported on the 
tone. Preparatory reception for the 
light and reporting on it may interfere 
with auditory sensitivity and thus reduce 
the magnitude of the facilitation pro- 
duced by the supra-threshold light 
increments. In addition, this factor in 
Ss of Group I undoubtedly accounts 
for the inhibitory effect of the sub- 
threshold light intensities on auditory 
acuity. 

Although the view that the changes
in auditory acuity represent a continuous 
function of change in the brightness of 
the light cannot be discarded upon the 
basis of the analyses of the data, there 
is little support in the analyses for this 
view and there are additional arguments 
against it. In the case of Ss of Group I, 
the inhibitory effect may be due entirely 
to the instructions about the light, with 
there being no effect of the sub-threshold 
light intensities as such. This hypo- 
thesis is supported by the results of 
Group II for which there was no effect 
of the sub-threshold light intensities, 
and by the lack of any effect of the 
threshold light intensity for either group. 
The effect of the light per se on auditory 
acuity is one of facilitation, and the 
effect can occur only when the light 
intensity is above the difference thresh-
old. There seems to be no basis for
expecting a differential effect for bright-
ness of the light on auditory acuity once
the intensity is above the threshold. 
If the instructions do interact with the 
effect of brightness, the only possibility 
seems to be a slight inhibitory effect 
of the instructions for Group I that is 
of constant magnitude for all supra-
threshold light intensities.

This theoretical position is consistent 
with that presented by Gregg and 
Brogden (3). The present experiment, 
however, has produced results that are 
not completely consistent with the 
results obtained by Gregg and Brogden. 
For their Group I, the effect of instruc- 
tions was inhibitory to auditory acuity 
for both the sub-threshold and supra- 
threshold light intensities; whereas, for 
Group II facilitation occurred for both 
light increments. Our results confirm 
the inhibitory effect for Group I at the 
sub-threshold light intensity and the 
facilitative effect for Group II at the
supra-threshold light intensity. The dif- 
ferences in results are likely due to 
the difference in the number of light 
intensities and in the sequence in which 
they were presented to Ss. In view 
of the relatively consistent results for 
the sub-threshold and supra-threshold 
light intensities for Groups I and II, 
the present results probably reflect 
more adequately the effect of the ex- 
perimental variables on auditory acuity. 
However, the results obtained by Gregg 
and Brogden (3) were contaminated by 
a practice effect and although the results 
from the control group suggest that 
this factor was apparently not operative 
in the present experiment, the signifi- 
cance of the Design Uniqueness term 
in the analysis of variance (Table 2) 
may be an indication of a significant 
practice effect as a function of light 
intensities. It is more likely that this 
reflects an effect produced by the present 
experimental design in which the four 
lowest light intensities always occurred 
prior to the four supra-threshold light 
intensities. 

There has been little additional re- 
search in this area reported since 
publication of the Gregg and Brogden
study (3). In a survey of the Russian
literature, London (4) notes that the 
presence of light has been reported to 
increase auditory sensitivity, whereas 
the absence of light lowers it. He 
notes further that green light has a 
facilitative effect but red light has an 
inhibitory effect. In connection with 
this latter effect, O’Hare (5) has reported 
a study comparing the effects of different 
wave lengths of light upon the auditory 
threshold. Unfortunately, wave length 
was confounded with brightness and 
the results were analyzed by improper 
statistical treatment. These factors, 
together with inconsistent results, make
it impossible to interpret the experiment 
meaningfully. 

The outstanding feature resulting from 
consideration of the literature on the 
effect of visual stimulation upon auditory 
sensitivity is the lack of consistency in 
results. Much of the inconsistency is 
due to inadequate experimental design 
and statistical treatment of the data. 
Some is probably due to the effect of 
parameters such as instructions, prac- 
tice, and conditions relating to light 
stimuli and their mode of presentation. 
It is evident that the effect is one of 
facilitation, the magnitude of the effect 
is small, and that it is altered by in- 
structions given to S, by the intensity 
of the light at least between sub-thresh- 
old and supra-threshold intensities, and 
may be altered by other conditions. 
It appears unlikely that any precise
functional relations exist between au-
ditory acuity and simultaneous light
stimulation, and detailed investiga-
tions pointed toward discovery of precise
functional relations appear fruitless at the
present time.


SUMMARY 


Auditory thresholds were measured under 
eight conditions of a simultaneous visual 
stimulus with Ss in two experimental groups 


each under different instructions about the 
presence of the light and the requirement to 
report on it. A control group to check on 
practice or other temporal effects was given 
the same auditory treatment as the experimental 
groups but all thresholds were taken under the 
zero light condition. For the experimental 
groups, in addition to the zero light intensity,
seven light increments defined in terms of the
difference threshold, some below and some
above, were used.

Analysis of variance of the data for the three 
groups showed significant results for light 
conditions, the interaction of light conditions 
and experimental conditions (instructions and 
treatment of the three groups), and design 
uniqueness. A separate trend analysis of the 
data for the control group showed no significant 
trend, and it was therefore assumed that no 
practice or other temporal effect was present 
in the data of the experimental groups. Analy-
ses of the data for the two experimental groups
support the conclusion that at sub-threshold
light intensities, the instructions for Group I
about the presence of the light and a require-
ment to report on it have an inhibitory effect
on auditory acuity, and that the supra-threshold 
light intensities have a facilitative effect on 
auditory acuity, but that the instructions for 
Group I result in less facilitation than do those 
given to Group II where no mention of the light 
was made. The results and conclusions are 
discussed in terms of relevant literature. 




REFERENCES 


1. Alexander, H. W. A general test for trend.
Psychol. Bull., 1946, 43, 533-557.

2. Grant, D. A. Analysis-of-variance tests in
the analysis and comparison of curves.
Psychol. Bull., 1956, 53, 141-154.

3. Gregg, L. W., & Brogden, W. J. The
effect of simultaneous visual stimulation
on absolute auditory sensitivity. J.
exp. Psychol., 1952, 43, 179-186.

4. London, I. D. Research on sensory inter-
action in the Soviet Union. Psychol.
Bull., 1954, 51, 531-568.

5. O’Hare, J. J. Intersensory effects of visual
stimuli on the minimum audible
threshold. J. gen. Psychol., 1954, 5A,
167-170.


(Received January 2, 1957) 








Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 


DIFFERENTIAL CONDITIONING AND 
INTENSITY OF THE UCS¹


W. N. RUNQUIST, K. W. SPENCE, AND D. W. STUBBS 


State University of Iowa


A recent series of studies (6, 8, 9, 
10, 11, 12, 13, 14) of the factors 
determining performance in classical 
defense conditioning have, in general, 
tended to support the Hullian inter- 
pretation that response strength is a 
multiplicative function of the learn-
ing factor (H) and the drive factor
(D), i.e., R = f(E) = f(H × D).
These studies have also presented 
evidence supporting the further theo- 
retical interpretation that drive (D) 
level in such aversive conditioning 
situations is a positive function of the 
level of emotional responsiveness in- 
duced in S by the noxious UCS or 
other accompanying aversive stimuli. 
According to this interpretation, the hy- 
pothetical emotional response evoked 
by the noxious UCS is assumed to 
be relatively persistent, extending 
well beyond the intertrial intervals 
typically employed in such experi- 
ments. Thus the drive (D) level 
present at the moment of occurrence 
of a conditioned anticipatory response 
is conceived to depend upon the level of 
persisting emotional activity aroused 
by the noxious stimulation on the 
preceding trials. This level will, in 
part at least, be a function of the 
intensity or noxiousness of the UCS. 

Also in accord with this theoretical
conception are the investigations (10,
13, 14) in which individuals varying in
emotional responsiveness as deter-
mined by an independent test (Mani-
fest Anxiety Scale) have been shown
to perform at different levels in
classical aversive conditioning. The
interpretation of this finding is that
Ss who score at the extremes (e.g.,
upper and lower 20%) of this anxiety
scale react to the same UCS with
different degrees of emotionality and
thus have different levels of D opera-
tive.

¹This study was carried out as part of a
project concerned with the influence of motiva-
tion on performance in learning under contract
N9onr-93802, Project NR 154-107 between
the State University of Iowa and the Office of
Naval Research.

The above theoretical schema has 
also been applied to the more complex 
differential conditioning situation. In 
such conditioning one stimulus (S+)
is always reinforced while the second
stimulus (S−) is never reinforced. As
habit strength (H) to S+ develops
with successive reinforced trials, gen-
eralization of this H to S− is assumed
to occur, the amount of the general-
ized habit (H̄) being a function of
(a) the amount of H developed to S+,
and (b) the distance between S+ and
S− on the dimensions along which
they vary.

Applying our theory of the manner
in which D and H interact multiplica-
tively to determine response strength,
it will be seen that high-drive Ss
should respond at a higher level than
low-drive Ss to both the S+ and the
S−. A further prediction made by
the theory is that the difference
between the excitatory strengths (E+
and E−) of the positive and negative
stimuli should be a function of the
level of D, i.e., the higher the drive
level, the better should be the dis-
crimination between the two stimuli.
The derivation of this implication is
as follows:


$$E^{+} = H \times D$$
$$E^{-} = \bar{H} \times D$$
$$E^{+} - E^{-} = D(H - \bar{H})$$
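
The implication for two drive levels can be written out one step further. What follows is only a sketch: the subscripts "hi" and "lo" are labels introduced here for the higher and lower drive levels, and it is assumed, for simplicity, that H and the generalized habit H̄ are the same at both levels.

```latex
\begin{align*}
E^{+}_{hi} - E^{-}_{hi} &= D_{hi}\,(H - \bar{H}) \\
E^{+}_{lo} - E^{-}_{lo} &= D_{lo}\,(H - \bar{H}) \\
\bigl(E^{+}_{hi} - E^{-}_{hi}\bigr) - \bigl(E^{+}_{lo} - E^{-}_{lo}\bigr)
  &= (D_{hi} - D_{lo})\,(H - \bar{H}) \\
  &> 0 \quad \text{whenever } D_{hi} > D_{lo} \text{ and } H > \bar{H}
\end{align*}
```

Under these assumptions the discrimination index is larger at the higher drive level, which is the prediction stated in the text.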


Evidence bearing on these predic- 
tions has been provided by three 
separate experiments in which the 
level of D has been varied by selecting 
Ss on the basis of their anxiety scale 
scores (9, 11). The results have 
supported the predicted differences 
in the case of the positive stimuli, 
but not in the case of the negative 
stimuli. Thus high-anxious Ss had 
significantly greater mean E+ values
than low-anxious Ss, whereas the
mean E− values for the two groups
were not significantly different. How-
ever, in five independent instances
the high-anxious Ss gave the higher
E− values. A similar finding was
obtained in the case of the index of
degree of discrimination (E+ − E−).
In five independent instances the 
value of this index was also greater 
for the high-anxious Ss than for the 
low-anxious Ss, although none of the 
differences was significant. 

The present study attempts to 
extend the investigation of this theo- 
retical schema and its implications 
by studying the effects on level of 
response to and degree of differentia-
tion between S+ and S− of varying the
level of D by direct manipulation 
of the intensity of the UCS. 


METHOD


Subjects. —The Ss were 60 women enrolled in 
courses in introductory psychology. Each 
S was assigned to one of two groups in order of 
appearance at the laboratory. These groups 
were differentiated only in terms of the intensity 
of the UCS used in conditioning, with Group S
receiving a 2.0 lb./sq.in. air puff and Group W a
.3 lb./sq.in. air puff. An additional 31 Ss were
run, but were discarded for various reasons. 
Failure to meet a criterion previously established 
(10, 13) for the exclusion of voluntary responders 








W. N. RUNQUIST, K. W. SPENCE, AND D. W. STUBBS 


eliminated 11 Ss in Group S and 7 Ss in Group 
W.. The high incidence of voluntary responders 
is probably due to the fact that many Ss knew 
about conditioning from their course work. 
Six Ss were eliminated because they gave CR’s 
during pre-test trials. Two Ss in Group W 
and one in Group S were discarded for failing 
to give a single CR to the positive stimulus 
during conditioning. The remaining four Ss 
were dropped due to procedural errors or 
apparatus failure. 

Apparatus.—The equipment for recording 
eyeblinks and presenting the UCS was identical 
with that used in a previous study (13). The 
CS was a tone produced by a loudspeaker 
driven by a Hewlett-Packard oscillator. The
positive CS (S+) was 500 cycles, the negative
CS (S−) was 5000 cycles. The sound pressure
level of the tones at the position of S’s head
was 70 db as measured by means of a General
Radio sound level meter, Type 759-B. The
duration of S+ was 550 msec. with the UCS
occurring 500 msec. after its onset. The dura-
tion of S− was 2550 msec. with the UCS occurring
at 2500 msec. This latter interval has been 
shown to produce extinction of a CR established 
at 500 msec. (4,5). The duration of the UCS 
was limited to 50 msec. by means of a 110 v., 
60 cycle, A.C. operated solenoid valve controlled 
by an electronic timer. 

A CR was recorded when the record showed
a deflection of 1 mm. or more in the interval
150 to 500 msec. following the onset of the CS.
Responses with a latency less than 150 msec.
were classified as original responses to the tone
and were not included in the data.
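
The scoring rule just described can be sketched as a small function. This is only an illustration: the function and constant names are invented here, both ends of the 150 to 500 msec. interval are assumed to be inclusive, and the "other" label for later deflections is an assumption, since the text does not say how these were treated.

```python
# Hypothetical sketch of the CR scoring rule; all names are illustrative only.
MIN_DEFLECTION_MM = 1.0      # "a deflection of 1 mm. or more"
CR_WINDOW_START_MSEC = 150   # start of the scoring interval after CS onset
CR_WINDOW_END_MSEC = 500     # end of the scoring interval (assumed inclusive)


def classify_response(latency_msec, deflection_mm):
    """Classify one recorded eyelid deflection relative to CS onset."""
    if deflection_mm < MIN_DEFLECTION_MM:
        return "none"        # too small to be scored
    if latency_msec < CR_WINDOW_START_MSEC:
        return "original"    # original response to the tone, excluded from data
    if latency_msec <= CR_WINDOW_END_MSEC:
        return "CR"          # scored as a conditioned response
    return "other"           # after the scoring interval (assumed label)


if __name__ == "__main__":
    for latency, size in [(100, 2.0), (300, 2.0), (300, 0.5), (600, 3.0)]:
        print(latency, size, classify_response(latency, size))
```

Note that the same 150 to 500 msec. window is applied whichever CS is presented, as the text states a single scoring interval.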

Procedure.—Following reading of the in-
structions, each S received two presentations
of the positive CS and two presentations of the
negative CS alone. Finally a single presenta-
tion of the UCS alone was given. The order
of presentation was pre-arranged so that within
a block of ten trials, there were five presentations
each of S+ and S−, with the restriction that no
more than three S+ trials nor more than two
S− trials occurred in succession. A verbal
ready signal was given S from 2 to 4 sec. before
the onset of the CS. The Ss were instructed
to blink and then fixate a dimly illuminated
milk glass disc upon presentation of the ready
signal. The intertrial intervals (time between
ready signals) were 15, 20, and 25 sec. according
to a fixed schedule. All Ss were questioned
as to the nature of the experiment at the con-
clusion of the session and warned not to discuss
the experiment with anyone. Although 40%
of the Ss indicated some knowledge of condi-
tioning, this did not differentially affect the
two groups.
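
One way to produce a trial order meeting the stated restrictions is rejection sampling, sketched below. The authors used a pre-arranged order and do not describe how it was drawn up, so this is an assumption-laden illustration: the function names are invented, and the run-length limits are assumed to apply across block boundaries as well as within blocks.

```python
import random


def max_run(sequence, symbol):
    """Length of the longest unbroken run of `symbol` in `sequence`."""
    longest = current = 0
    for item in sequence:
        current = current + 1 if item == symbol else 0
        longest = max(longest, current)
    return longest


def make_schedule(n_blocks, rng, max_pos_run=3, max_neg_run=2):
    """Build n_blocks blocks of ten trials, five S+ and five S- in each.

    A shuffled block is accepted only if the whole schedule so far still
    has no more than max_pos_run S+ trials and max_neg_run S- trials in
    succession; otherwise the block is reshuffled.
    """
    schedule = []
    for _ in range(n_blocks):
        while True:
            block = ["S+"] * 5 + ["S-"] * 5
            rng.shuffle(block)
            candidate = schedule + block
            if (max_run(candidate, "S+") <= max_pos_run
                    and max_run(candidate, "S-") <= max_neg_run):
                schedule = candidate
                break
    return schedule


if __name__ == "__main__":
    # Twelve blocks of ten trials; the seed is arbitrary.
    trials = make_schedule(12, random.Random(0))
    print(" ".join(trials[:10]))
```

An alternating block always satisfies both limits, so an acceptable block exists whatever the preceding block ended with and the loop terminates.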











Fig. 1. Percentage of CR’s in blocks of 10 trials for the present study (left panel, UCS intensity) and for Spence and Farber (11) (right panel, anxiety). Ordinate: per cent CR’s. Curves are shown separately for S+ and S−.


RESULTS


The acquisition curves of condi-
tioning to both the positive and
negative stimulus are presented in
the left part of Fig. 1, in terms of
the percentage of CR’s in each block
of 20 trials, 10 of which were S+ trials
and 10 S− trials. Clearly Group S,
which had the strong air puff, is
superior to Group W in performance
both to the positive and negative
cue stimuli as well as showing better
discrimination. Statistical evaluation
of these results was made by the
Mann-Whitney test. The data for
the last 20 trials were eliminated in
this analysis for two reasons: (a) a
comparison was desired with pre-
vious studies in which conditioning
was only carried out to 100 trials, and
(b) it was felt that some sort of
fatigue effects may have differentially
influenced the groups, since Group S
was making more frequent and more
vigorous eyelid responses. The elim-
ination of these trials does not change
the conclusions, although the prob-
abilities in general tend to be some-
what smaller when these data are
included.

The difference between the number
of CR’s² to the positive stimulus for
the two groups was significant at the
.02 level for a single tailed test
(z = 2.08). The number of CR’s
to the negative stimulus likewise
produced a significant difference
(z = 1.93, P = .03); however, dis-
crimination (CR’s to S+ minus CR’s
to S−) was not significantly better
for Group S as opposed to Group W
(z = .886, P = .19).


² Although the predictions are in terms of E,
the statistical analyses were made on the 
number of CR’s. Since a non-parametric 
test, which depends only on the rank-order 
of the numbers, was used in all instances, the 
values of z obtained would be the same regardless 
of whether the data were converted to E values 
or not. 
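
The point of this footnote can be checked numerically. The sketch below computes the Mann-Whitney z by the usual normal approximation (without a tie correction, a simplification made here) and shows that a strictly increasing transformation of the scores leaves z unchanged. The CR counts and the square-root transformation are invented stand-ins, not the data or the E conversion of the study.

```python
import math


def rank_with_ties(values):
    """Ranks from 1 to n, with tied values given the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_z(group_a, group_b):
    """Normal-approximation z for the Mann-Whitney U of group_a (no tie correction)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = rank_with_ties(list(group_a) + list(group_b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u1 - mean_u) / sd_u


if __name__ == "__main__":
    # Hypothetical CR counts for two groups (illustrative values only).
    strong = [34, 41, 28, 45, 39]
    weak = [22, 30, 18, 35, 27]
    z_raw = mann_whitney_z(strong, weak)
    # Any strictly increasing transformation keeps the rank order intact.
    z_transformed = mann_whitney_z([math.sqrt(x) for x in strong],
                                   [math.sqrt(x) for x in weak])
    print(z_raw == z_transformed)
```

Because the statistic depends on the scores only through their ranks, the two z values are identical rather than merely close.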
















Discussion 


Despite lack of significance, the data 
add more independent evidence con- 
firming the theoretical predictions con- 
cerning differential conditioning. In 
four experiments reported, differences 
in the predicted direction have been 
obtained in almost every comparison. 
In order to get a picture of the similarity 
of the results obtained here with those 
obtained when D is manipulated by 
the selection of Ss on the Manifest 
Anxiety Scale, comparable acquisition 
curves from the data obtained in Exp. 
II by Spence and Farber (11) are plotted 
in the right half of Fig. 1. This par- 
ticular experiment was run under con- 
ditions nearly identical with those of 
the present study, except that high- and 
low-anxious Ss were used with a con-
stant UCS intensity. In obtaining these
curves, only those Ss who were classified 
high or low anxious on the Taylor Scale 
were used, thus eliminating a number 
of Ss who were included in the previous 
report on the basis of scores on the 
Heineman Forced Choice Scale (1). 
The latter test has been found to have 
low reliability. It can be seen that 
the curves are quite similar to those 
obtained in the present study with the 
order of the curves the same in both 
studies. 

Statistical analysis of these data
also leads to practically the same con-
clusions as in the case of UCS intensity.
Comparison of the number of CR’s
between the high- and low-anxious
groups by means of the Mann-Whitney
test gave z’s of 1.62 for the positive
stimulus, 1.10 for the negative stimulus
and 1.28 for the difference between
S+ and S−. Only the first of these is
significant at the .05 level on one tail.
Taking the two studies together, half
of the predicted differences were sig-
nificant although all six were in the
predicted direction.


TABLE 1

OBTAINED z VALUES ON COMPARISONS
BETWEEN DRIVE GROUPS MADE ON
UPPER TWO-THIRDS OF Ss

                        Positive       Negative       Discrim-
                        stimulus       stimulus       ination
Comparison              z      P       z      P       z      P
UCS intensity, S > W    2.96   .001    1.64   .05     1.83   .03
HiA > LoA               —      .01     1.47   .07     .90    .18

One factor which undoubtedly con- 
tributes to the lack of significance is 
the great variability of behavioral data,
even in such simple experiments as
classical conditioning. One method of
reducing this variability is to treat
only selected portions of the data (7).
If the assumption is made that within
a given drive group, differences in the
total number of CR’s are primarily a
function of differences in the amount
of H developed, then eliminating Ss 
who show little evidence of conditioning 
eliminates those who have low amounts 
of H. According to the theory, the 
amount of conditioning is determined 
by D multiplying H. If the value of 
H is small, variations in D will not 
produce as large differences in perform- 
ance as when the value of H is large, 
thus the predictions should appear 
more clearly in Ss who showed a great 
deal of conditioning. 
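
The argument can be illustrated with a short numerical sketch. The values of H and D below are arbitrary and chosen only for illustration; the function names are likewise invented here.

```python
def excitatory_strength(habit, drive):
    """E = H x D, the multiplicative rule assumed by the theory."""
    return habit * drive


def drive_effect(habit, drive_high, drive_low):
    """Difference in E produced by the two drive levels at a fixed H."""
    return (excitatory_strength(habit, drive_high)
            - excitatory_strength(habit, drive_low))


if __name__ == "__main__":
    # Arbitrary illustrative values: a poor and a good conditioner,
    # each considered under the same pair of drive levels.
    for habit in (0.2, 0.8):
        print(habit, drive_effect(habit, drive_high=2.0, drive_low=1.0))
```

With these values the same difference in D separates the two drive conditions four times as widely for the S with the larger H, which is why the predictions are expected to show more clearly among good conditioners.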

These implications were tested in 
both the present study and in the 
Spence and Farber study by considering
only the top two-thirds of the Ss. The 
division was made on the basis of the 
total number of CR’s to the positive CS, 
and eliminated all Ss who did not show 
a fairly large amount of conditioning. 
Comparison between drive groups on 
the number of CR’s on Trials 61-100 
to the positive stimulus, negative stimu- 
lus, and amount of discrimination were 
made and evaluated by means of the 
Mann-Whitney test. The obtained val- 
ues of z are shown in Table 1. The 
results in the case of the UCS intensity
experiment are clear cut, as all three 
response measures show superiority sig- 
nificant at the .05 level or above for the 
high drive group. In the anxiety ex- 
periment the results are not as clearly 
supportive. In two of the three com-
parisons, the reliability of the differences
increased when the poor conditioners
were eliminated with the differences 
to the negative stimulus now approaching 
significance. Considering the two ex- 
periments, four of six comparisons 
showed an increase in significance level 
despite the fact that there were fewer 
Ss used in making these comparisons. 
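
The selection step can be sketched as follows. The text does not say how ties at the cut-off or group sizes not divisible by three were handled, nor whether the cut was made within each drive group; rounding up and selecting within a single group are assumptions made here, and the subject labels and counts are invented.

```python
def top_two_thirds(cr_counts):
    """Return the Ss in the upper two-thirds by CR's to the positive CS.

    cr_counts maps a subject label to that S's total number of CR's to
    the positive CS. The number retained is two-thirds of the group,
    rounded up (an assumption; the text gives no rounding rule).
    """
    n_keep = (2 * len(cr_counts) + 2) // 3   # integer ceiling of 2n/3
    ranked = sorted(cr_counts, key=cr_counts.get, reverse=True)
    return ranked[:n_keep]


if __name__ == "__main__":
    # Hypothetical counts for six Ss in one drive group.
    group = {"s1": 40, "s2": 10, "s3": 25, "s4": 5, "s5": 33, "s6": 18}
    print(top_two_thirds(group))
```

For a group of 30 Ss this rule retains 20, and the retained Ss would then be compared between drive groups by the Mann-Whitney test as described.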

As has been pointed out previously, 
the analysis of differential conditioning 
is undoubtedly oversimplified in that 
factors such as inhibition and set have 
not been considered. For example, Hil- 
gard, Campbell, and Sears (2, 3) have 
shown that knowledge of the stimulus 
relationships seems to facilitate dis- 
crimination. In the present study Ss 
who indicated either knowledge in terms 
of the discrimination or knowledge 
concerning conditioning showed better 
overall performance and somewhat better 
discrimination. However, until more 
evidence is obtained concerning the role 
of these factors in simple conditioning, 
it is impossible to apply them to the 
more complex differential conditioning 
situation. 


SUMMARY 


An experiment was conducted to extend the 
data relating drive strength to differential 
eyelid conditioning by varying the intensity 
of the UCS. Each S received 60 trials with
a 500-cycle tone as the positive CS, and 60
trials with a 5000-cycle tone as the negative
CS in a pre-arranged order. The interval
between the onset of the positive CS and the
UCS was 500 msec.; the interval between
the negative CS and the UCS was 2500 msec.
Two groups were run, Group S having a 2.0
lb./sq.in. air puff and Group W having a .3
lb./sq.in. air puff.

The number of CR’s to the positive stimulus 
and to the negative stimulus was significantly 
greater for Group S. In agreement with studies 
using high and low anxious Ss, the discrimination 
was also better for Group S, but the difference
failed to achieve significance. When only 
the top two-thirds of the Ss are considered, 
all of the predicted differences are significant. 
The findings are consistent with expectations 
based on Hullian notions relating drive and 
habit, and with results of experiments relating 
anxiety and differential conditioning. 


REFERENCES


1. Heineman, C. E. A forced choice form of
the Taylor Anxiety Scale. J. consult.
Psychol., 1953, 17, 447-455.

2. Hilgard, E. R., Campbell, A. A., & Sears,
W. N. Conditioned discrimination: the
development of discrimination with and
without verbal report. Amer. J. Psy-
chol., 1937, 49, 564-580.

3. Hilgard, E. R., Campbell, R. K., & Sears,
W. N. Conditioned discrimination: the
effect of knowledge of stimulus relation-
ships. Amer. J. Psychol., 1938, 51,
498-506.

4. McAllister, W. R. Eyelid conditioning
as a function of the CS-UCS interval.
J. exp. Psychol., 1953, 45, 417-422.

5. McAllister, W. R. The effect on eyelid
conditioning of shifting the CS-UCS
interval. J. exp. Psychol., 1953, 45,
417-422.

6. Passey, G. E. The influence of intensity
of unconditioned stimulus upon acquisi-
tion of a conditioned response. J. exp.
Psychol., 1948, 38, 420-428.

7. Spence, K. W. Mathematical formula-
tions of learning phenomena. Psychol.
Rev., 1952, 59, 152-160.

8. Spence, K. W. Learning and performance
in eyelid conditioning as a function of
the intensity of the UCS. J. exp.
Psychol., 1953, 45, 57-63.

9. Spence, K. W., & Beecroft, R. S. Dif-
ferential conditioning and level of anxiety.
J. exp. Psychol., 1954, 48, 399-403.

10. Spence, K. W., & Farber, I. E. Condi-
tioning and extinction as a function of
anxiety. J. exp. Psychol., 1953, 45,
116-119.

11. Spence, K. W., & Farber, I. E. The
relation of anxiety to differential eyelid
conditioning. J. exp. Psychol., 1954,
47, 127-134.

12. Spence, K. W., Farber, I. E., & Taylor,
E. The relation of shock and anxiety
to level of performance in eyelid condi-
tioning. J. exp. Psychol., 1954, 48,
404-408.

13. Spence, K. W., & Taylor, J. A. Anxiety
and strength of the UCS as determiners
of the amount of eyelid conditioning.
J. exp. Psychol., 1951, 42, 183-188.

14. Taylor, J. A. The relationship of anxiety
to the conditioned eyelid response.
J. exp. Psychol., 1951, 41, 81-92.


(Received January 7, 1957) 








Journal of Experimental Psychology
Vol. 55, No. 1, 1958 


MEDIATED VERBAL SIMILARITY AS A DETERMINANT 
OF THE GENERALIZATION OF A 
CONDITIONED GSR¹


LAURA W. PHILLIPS 


University of California 


The generalization or “spread” of 
CR’s to stimuli on the same dimension 
as the CS has been studied exten- 
sively. With human Ss, generaliza- 
tion along such dimensions as tonal 
frequency (7), tonal intensity (8), 
and spatial distances along the skin 
(1) has been investigated. Generali-
zation of the salivary CR along verbal
dimensions has been explored by
Razran (12), and the same verbal
dimensions were employed in a study
by Riess (13), using the conditioned
galvanic skin response (GSR). Both
investigators found that the CR
generalized to meaningful synonyms,
e.g., from “urn” to “vase,” more than
to phonetographically similar words
such as from “urn” to “earn.”


Cofer and Foley have applied
Hull’s principles of primary and
secondary generalization in an analy-
sis of language behavior (2). They
have demonstrated homophone, syno-
nym, and antonym generalization by
preliminary training with lists of
words related to a test list along these
dimensions (3, 4). However, the
scaling of verbal similarity employed
by these investigators suffers from
methodological difficulties. First, it
is difficult to isolate a single verbal
dimension. For example, a given
synonym dimension is likely to be
confounded with other synonym
dimensions as well as with various
homophone, pseudo-homophone and
antonym dimensions, to mention only
a few possibilities (2, 3, 4). This
method of scaling encounters further
problems in connection with the
determination of equal scale intervals.

¹This research represents a major portion
of a thesis submitted in partial fulfillment
of the requirements for the degree of Master of
Arts in the Department of Psychology of the
University of California in 1953. The writer
wishes to express her appreciation to Professors
Leo J. Postman and Edwin E. Ghiselli for
guidance and advice in this research, and to
Professor Mark R. Rosenzweig for technical
assistance in the use of the Garceau Clinical
Dermohmeter and in evaluation of the readings.

To overcome these difficulties, it 
seems necessary to choose verbal 
stimuli which are initially devoid of 
meaning and to induce varying de- 
grees of similarity among them under 
controlled conditions of associative 
learning. The degree of similarity 
can then be defined and measured 
with reference to the conditions of 
preliminary training. 

The purpose of the present study 
was to investigate the generalization 
of a CR along an experimentally 
produced dimension of verbal simi-
larity. The method chosen for pro-
ducing varying degrees of similarity
among a series of verbal stimuli was 
that of establishing associative con- 
nections between nonsense words and 
points along a well-established sensory 
dimension (in this case, brightness). 
When associations between members 
of a series of verbal stimuli and 
members of a series of brightness 
stimuli have been established, it is 
assumed that an associative relation- 
ship will exist among the words 
themselves, mediated by the associa- 
tions between the words and different 
positions along the sensory scale. 
Such a mediated similarity scale
would be manifest by Ss’ subsequent
responses under relevant test con-
ditions.

The experiment attempted to an-
swer two questions: (a) Can a scale
of mediated verbal similarity be
established? (b) What is the nature
of the generalization gradient as a
function of such a similarity scale?


METHOD
Materials


The stimuli used were (a) five seven-letter
Turkish (14) words typed on 35-in. by 3-in.
cards and presented in random orders; (b)
five Munsell grays of the values N1/, N3/, N5/,
N7/, and N9/ respectively from dark to light
(10).² The five values chosen constituted a
series of equal sense distances, 100 JND’s
separating successive scale positions. Three
sets of the five grays were cut into I} in. circles
and mounted on 6-in. by 18-in. white cards.
The values of the grays were ordered differently
on each of the white cards according to a re-
stricted randomization, i.e., they never appeared
in the order of their scale values. These three
displays made it possible to vary the spatial
positions of the different grays from trial to
trial.

For purposes of experimental training, each
of the Turkish words was paired with one of
the gray circles. The particular word-color
combinations were varied from S to S. Five
different sets of paired associates were used
in rotation.

Procedure 


The experimental period was divided into
three parts: a training session devoted to
establishing associative connections between
the words and the points on the brightness
dimension; a conditioning session in which one
of the words was used as a CS and the other
words were used in tests for generalization; and,
finally, a short association test based on the
previously learned words.

Training session.—The Ss were instructed
to learn the paired associates as a memory task,
and to pronounce the words as though they
were English. The training procedure began
with an initial presentation of the paired
stimuli in the following manner: E showed
each word card to S once and pointed to the
gray circle paired with it while S pronounced
the word. Paired-associate training trials were
then carried out as follows: (a) E pointed to a
circle while S was required to pronounce the
word which had been previously presented with
that color; (b) a word card was shown to S,
he pronounced the word and was required
simultaneously to point to the correct color
circle. The five words or colors were used
randomly as stimuli or responses respectively in
the above manner within each trial in order
to equalize the strengths of the associations
in both directions, i.e., word-color, and color-
word. A trial consisted of the presentation
of the five pairs in the manner described under
(a) and (b).

A 3-sec. interval was allowed between the
presentation of the words or colors and S’s
response. If S did not respond within 3 sec.,
E gave the response by indication, i.e., by point-
ing to a color or showing a word card, and S
repeated it by pronouncing the word and point-
ing to the color. The E confirmed or corrected
each response by indication, and S repeated
each correction. An interval of 5 sec. elapsed
between S’s response and the next stimulus
presentation. A 10-sec. intertrial interval was
allowed in order to change the color-display
and word-card orders.

The trials were continued until S learned to
a criterion of three perfect recitations. After
criterion was reached, S was given a 5-min.
rest interval. Following the rest, training
continued to the point of 100% overlearning
(double the number of trials up to, but not
including, the three criterial trials) in order to
insure a high degree of association between
the words and colors.

²The author wishes to express appreciation
to Professor Gordon L. Walls of the Optometry
Department of the University of California
for his expert advice in choosing the values of
the Munsell grays and for preparing the color
circles used in the experiment.

Conditioning session.—After completion of
the training session, a screen was placed on the
table between S and E. The words were presented
through a cut-out in the screen. A Garceau
Clinical Dermohmeter was used to
measure the skin resistance. Electrodes were
placed on the palms and held firmly in place
with adhesive tape. The UCS was a pure
tone of 300 cps, at 80 db sensation level. This
tone had been tested on S prior to the beginning
of the training session in order to ascertain
sensitivity of his GSR. The tone was
sufficiently loud to elicit a GSR from all Ss.
Pre-conditioning levels of GSR were determined
for each S by presenting each of the
previously-learned words twice in random orders.
A plain, white card, of the same dimensions
as those on which the words were presented,
was also presented without reinforcement at
unequal intervals throughout all the conditioning
and test trials in order to extinguish
any response to the card as such.

Twelve reinforced trials were given, in which 
the CS (the N1/ word, dark extreme of scale) 
was paired with the UCS. The interval be- 
tween presentation of CS and UCS was 3 sec., 
and an interval of approximately 10 sec. inter- 
vened between presentations of the CS—UCS 
pair. (This interval varied around 10 sec. due to 
variation in the time taken for the Dermohmeter 
needle to return to a zero reading after deflec- 
tion.) The CS was then presented twice with- 
out the tone to test for conditioning, after which 
each of the other four words in the series was 
presented twice in two different random orders 
to test for generalization. The test stimuli 
were presented at the same rate as the CS-UCS 
pair during conditioning. 

Association test.—After the conditioning
session was completed and electrodes, earphones 
and screen removed, an association test was 
given in order to get an independent test of the 
establishment of a similarity scale. Each of 
the words was shown to S for 2 sec., and he was 
instructed to respond immediately with any 
other word from the training series which the 
given cue word suggested. 


Subjects 


Twenty-one University of California freshmen 
served as Ss. They were all naive with respect 
to the purpose of the experiment, and none 
had any familiarity with the Turkish language.


RESULTS 


Paired-associate learning.—Mean 
trials to learn were computed on 


TABLE 1

ANALYSIS OF POST-CONDITIONING
INCREMENTS IN GSR’S (LOG
CONDUCTANCE CHANGES)

Measure                  Mean Difference
                         (over Pre-cond. Level)

CS
White card
CS minus white card




the basis of number of trials to and
including the last trial on which an
error was made. The three criterion
trials were not counted as trials to
learn. Mean trials to learn all words
to the criterion of three perfect
recitations was 13.02 (SD = 7.17).
There was a significant difference
between words paired with different
scale positions as indicated by an F
of 11.57, P < .01. The smallest
mean trials to learn were those yielded
by words associated with N1/ and
N9/. The Tukey gap test (15) was
used to determine significant breaks
between the means of trials to learn
words paired with the different scale
positions. The difference between
the means of the words associated
with the two colors representing the
end positions of the brightness scale
and those of the three middle positions
was significant beyond the .01
level. Hence, the easiest words to
learn were those paired with either
end of the brightness scale. This
finding cannot be attributed to the
relative difficulty of the words per se,
since they were rotated from S to
S with respect to their color associates.
Conditioning.—The scores used in 
this analysis were obtained from the 
Dermohmeter readings in the follow- 
ing way: immediately prior to the 
exposure of S to a word (CS and 
test words alike) the resistance level 
of S was recorded in ohms. The 
stimulus was then presented, and 
S’s GSR was measured in terms of 
the peak deflection from the initial 
resistance level. In accordance with 
the findings of Haggard (6) on selec- 
tion of appropriate GSR measures 
for use in analysis of variance tests, 
log conductance changes were com- 
puted for each response. 
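
As a rough illustration of the scoring just described, the following Python sketch converts one pair of resistance readings into a log conductance change. The exact transformation shown (conductance in micromhos, base-10 logarithm), the function name, and the sample readings are assumptions made for illustration only; the text states only that log conductance changes were computed in accordance with Haggard (6).

```python
import math


def log_conductance_change(r_initial_ohms, r_peak_ohms):
    """Hypothetical scoring helper (not from the original report).

    Converts two resistance readings to conductance in micromhos and
    returns log10 of the increase in conductance.
    """
    c_initial = 1e6 / r_initial_ohms   # conductance before the stimulus
    c_peak = 1e6 / r_peak_ohms         # conductance at peak deflection
    delta = c_peak - c_initial
    if delta <= 0:
        raise ValueError("no increase in conductance; log is undefined")
    return math.log10(delta)


# Illustrative (made-up) readings: resistance drops from 50,000 to 40,000 ohms.
score = log_conductance_change(50_000, 40_000)
print(round(score, 3))
```

A response that produced no drop in resistance has no defined logarithm under this assumed scoring, which is why the sketch raises an error rather than returning a value.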

Table 1 gives the analysis of mean
differences between pre- and post-conditioning
levels of response to the













[Figure 1: post-conditioning increments in GSR (log conductance change) plotted against scale positions of stimulus words; N3/, N5/, and N7/ are legible on the abscissa.]

Fig. 1. Generalization of the response conditioned
to the word previously associated with N1/.


CS word and to the neutral white
card. The t of 9.97 for the CS
demonstrates that a highly significant
degree of conditioning was established.
The mean difference of .23 between
pre- and post-conditioning response
levels to the white card does not
reach significance. Hence, generalized
GSR’s to a neutral stimulus associated
with the CS which may have developed
during conditioning were not
present to a significant degree during
the test for generalization. Finally,
the mean difference of 1.00 between
the post-conditioning increments in
GSR to CS and white card is significant
beyond .01. From these findings
it is concluded that the GSR
was differentially elicited by the
word used as CS, and did not constitute
a generalized response to the
total experimental situation.

The pre-conditioning GSR’s to the
five words show no significant variation
as a function of scale position
(F = 1.12). However, as shown in
Fig. 1, a trend appears in the generalized
responses tested after conditioning
of the word associated with N1/
(dark extreme of scale). Table 2
gives the results of the analysis of
variance of differences between the
pre-conditioning means and post-conditioning
means for the five stimulus
positions.

In order to analyze the shape of
the generalization gradient plotted in
Fig. 1, the method of orthogonal
polynomials, described by Grant (5),
was used. The over-all trend of the
responses as a function of the mediated-similarity
scale was significant
beyond the .01 level. The negative
linear and positive quadratic components
are each significant beyond
the .01 level. Hence, the shape of
the gradient is concave upward.
Since the quadratic coefficient is positive,
the significance
of the quadratic component is heavily
weighted by the generalized response
to the word associated with the N9/
position.
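
The logic of the orthogonal-polynomial analysis can be sketched in Python as follows. The coefficients are the standard ones for five equally spaced levels; the five means are hypothetical values chosen only to mimic a gradient that falls and then rises at the far end, and are not the obtained data. The function names and the sum-of-squares formula for a contrast on group means are likewise supplied for illustration.

```python
# Standard orthogonal polynomial coefficients for five equally spaced levels.
COEFFS = {
    "linear":    (-2, -1,  0,  1, 2),
    "quadratic": ( 2, -1, -2, -1, 2),
    "cubic":     (-1,  2,  0, -2, 1),
    "quartic":   ( 1, -4,  6, -4, 1),
}


def trend_contrasts(means):
    """Return the value of each polynomial contrast for five ordered means."""
    return {
        name: sum(c * m for c, m in zip(coeffs, means))
        for name, coeffs in COEFFS.items()
    }


def contrast_sum_of_squares(contrast, coeffs, n_per_mean):
    """Sum of squares (1 df) for a contrast computed on means of n scores each."""
    return n_per_mean * contrast ** 2 / sum(c * c for c in coeffs)


# HYPOTHETICAL means for N1/ ... N9/ (illustration only, not the reported data).
means = [1.2, 0.6, 0.3, 0.2, 0.5]
contrasts = trend_contrasts(means)
for name, value in contrasts.items():
    ss = contrast_sum_of_squares(value, COEFFS[name], n_per_mean=21)
    print(f"{name:9s} contrast = {value:+.2f}  SS = {ss:.3f}")
```

With means of this shape the linear contrast is negative and the quadratic contrast is positive, which is the pattern the text describes as a gradient that is concave upward.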


TABLE 2

TREND ANALYSIS OF POST-CONDITIONING
INCREMENTS IN GSR’S (LOG
CONDUCTANCE CHANGES)

Source                             df     MS      Error Term

A. Over-all trend                   4     1.022   C
   a. Linear                        1     1.492
   b. Quadratic
   c. Cubic
   d. Quartic
B. Between individual means        20     1.957   C
C. Between individual trends
   a. Linear
   b. Quadratic
   c. Cubic
   d. Quartic
D. Total













Association test.—If Ss were to give their
associations by choosing at random from the
four other Turkish words, the probabilities
of an adjacent association corresponding to
each of the brightness-scale positions are
N1/, 1/4; N3/, 1/2; N5/, 1/2; N7/, 1/2; N9/, 1/4. The
presentation of each of the five words to each
S allows five opportunities per S for an adjacent
association to occur, with probabilities as above.
Hence, by chance, there should be an average
of two adjacent associations per S.
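
The chance expectation can be checked with a short Python sketch. It assumes, as the text does, that S picks uniformly at random among the four other words; the function and variable names are illustrative.

```python
from fractions import Fraction

POSITIONS = ["N1/", "N3/", "N5/", "N7/", "N9/"]


def adjacent_probability(index, n=len(POSITIONS)):
    """Chance that a random choice among the other n - 1 words is a scale neighbor."""
    neighbors = sum(1 for j in (index - 1, index + 1) if 0 <= j < n)
    return Fraction(neighbors, n - 1)


probs = {pos: adjacent_probability(i) for i, pos in enumerate(POSITIONS)}
expected_adjacent = sum(probs.values())

for pos, p in probs.items():
    print(pos, p)
print("expected adjacent associations per S:", expected_adjacent)
```

The end positions have one neighbor each and the middle positions two, so the five probabilities sum to two adjacent associations per S.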

However, if a scale of similarity has been 
established among the words as a result of the 
paired-associate training with the brightness 
dimension, one should expect significantly 
more than chance adjacent associations. 

The mean of the obtained adjacent associations
was 2.52 (σm = .18). Since the frequency
distribution of adjacent associations was
reasonably symmetrical, a one-tail t test was
used to test the hypothesis that the obtained
mean was not greater than 2. Since t = 2.94
(P < .005), it is concluded that Ss responded
to the cues with another word from an adjacent
part of the scale significantly more than by
chance. Hence, the association results indicate
a similarity relationship among the five words
selected from an unfamiliar language.
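
The form of this test can be sketched in Python from the rounded figures given in the text. With a mean of 2.52 and a standard error of .18 the statistic comes out near 2.89 rather than the reported 2.94, presumably because the published mean and standard error are rounded; the variable names and the tabled critical value for 20 df are supplied here for illustration.

```python
observed_mean = 2.52      # mean adjacent associations per S (from the text)
chance_mean = 2.0         # expectation under random choice
standard_error = 0.18     # standard error of the mean (from the text)
n_subjects = 21

t = (observed_mean - chance_mean) / standard_error
df = n_subjects - 1

# Tabled one-tail critical value of t at the .005 level for 20 df.
CRITICAL_T_005_DF20 = 2.845

print(f"t({df}) = {t:.2f}")
print("exceeds the .005 one-tail criterion:", t > CRITICAL_T_005_DF20)
```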


Discussion 


The results of this experiment have
demonstrated: (a) the establishment
of a scale of mediated verbal similarity;
and (b) that a CR generalizes over the
major portion of such an experimentally-produced
dimension as a decreasing
function of the differences from the CS.

According to Hull (9, p. 184f), a 
sensory continuum, such as the bright- 
ness dimension used in this experiment, 
should generate an “afferent generaliza- 
tion continuum” during conditioning. 
Further, the generalized reaction potential
along a primary stimulus dimension
should constitute a decreasing
function of the distances from the CS.
Hull’s analysis of “secondary stimulus 
generalization” may be applied to the 
paired-associate training and subsequent 
conditioning of Ss in this experiment. 

During the paired-associate training, 
when a Turkish word and a gray circle 
are presented to S, his response is a 
complex act; i.e., visual perception of 
several shades of gray, discrimination 
of the “correct” shade, pronunciation 










of the word, and simultaneously pointing
to the discriminated shade. (For the
sake of simplicity of exposition, the
color-word direction of association will
be omitted, but the same theoretical
applications hold for both.) “Feedback”
stimuli then arise from this
complex of overt and mediating responses
made by the S to the stimuli on the
primary sensory dimension. Hypothetically,
such stimuli might function as
representational or symbolic cues to
which responses can be conditioned,
such as those which might be conceived
to arise from the judgmental act of
choosing the correct shade of gray to
pair with a given word.

One further assumption is needed:
The hypothetical mediating responses
made to any two neighboring points
on the sensory scale are more similar
than are those made to any two removed
points. In more concrete terms, visual
presentation of the word associated
with N1/ will elicit, together with its
overt responses (pronunciation of the
word, pointing to the color), several
mediating responses (e.g., “black,”
“very dark,” etc.). Similarly, presentation
of the words associated with N3/,
N5/, N7/, and N9/ will elicit different
overt and mediating responses. However,
the mediating responses elicited
by the N3/ word will be more similar
to those elicited by the N1/ word than
will those elicited by, say, the N5/
word.

Osgood (11, p. 704) proposes such a
“mechanism for degrees of similarity
among mediation processes,” and cites
the lack of this assumption as a serious 
flaw in the theoretical formulation of 
Cofer and Foley (2). Moreover, this 
assumption may be justified in the 
present experiment by the conditions 
of training; i.e., the sensory stimuli 
consisted of a pre-established differential 
series. 

Given these assumptions about what 
happens in the process of over-learning 
the paired associates during the training 
session, the results of the subsequent 
association test follow: The cue word
will elicit mediating responses which
are more similar to the mediating re-
sponses elicited by the word in the next
scale position than to those of further 
removed words. These mediating re- 
sponses will then produce stimuli which 
have become associated with the cue 
word; similar stimuli will have become 
associated with the next scale-position 
word and the resulting generalization 
gradients will overlap at a high level 
of reaction potential, as formulated by 
Hull (9, p. 195). Hence, the most 
probable word to be used as a response 
to any given cue will be the one associ- 
ated with the next sensory-scale position. 

Finally, when S’s GSR is conditioned 
to a word (in this case, the one associ- 
ated with N1/), the mediating responses 
to this word also become conditioned. 
Since the mediating processes of the 
other words are differentially similar 
to those of N1/, the conditioned GSR
will generalize to these words as a func- 
tion of their similarity. The further 
removed the words are along the scale, 
the less similar to N1/ (the CS) will 
be their mediation processes. Hence 
on subsequent tests for generalization, 
the generalized GSR will be a decreasing 
function of the scale distance from N1/. 

But how may we account for the rise 
in generalized GSR at the extreme 
opposite of the scale (see Fig. 1)? So 
far, the results of this experiment 
have paralleled Osgood’s formulation of 
semantic generalization, according to 
which generalized reaction to similar 
mediators will be facilitated and gen- 
eralized reaction to opposites will be 
inhibited (11, p. 708). However, he 
also presents evidence that responses 
to opposites (as frequently observed 
in free association tests) cannot be 
based upon semantic mediation proc- 
esses, but are due to highly overlearned 
verbal habits (11, pp. 709-711). There is 
also evidence to support this view from 
the paired-associate training data of 
the present experiment, viz., the “anchoring
effect” at the two extremes of the
scale exhibited in significantly lower 
mean trials to learn words paired with 
N1/ and N9/. Since these two grays 
were often referred to as “black” and 


“white” by Ss, it may well be that they
utilized a strong “black-white” response
habit to delineate the scale during the 
paired-associate training. It may be 
concluded, then, that the generalized 
GSR received a “boost” at the end of 
the mediated scale by virtue of summa- 
tion with an already stronger response 
at this point. 


SUMMARY 


An experiment was designed to establish
a scale of verbal similarity and to test the
generalization of a CR over this scale.

Five Turkish words were paired with five
neutral Munsell colors constituting a brightness
scale. Twenty-one Ss were trained to associate
words and colors to a criterion of 100% overlearning
beyond three perfect recitations of
the pairs. The Ss were then conditioned to
the word which had previously been learned
as a paired associate to the dark extreme of the
brightness scale. For conditioning, a loud,
pure tone was paired with the stimulus word,
and GSR was measured. The CS word and
the other words in the series were then tested
for GSR without the tone. After conditioning
the words were presented randomly and first
associations were recorded.

The results indicate that a mediated verbal
similarity scale was established, and that the
generalization gradient over the major part of
this dimension was a decreasing function of
the distance of the test stimulus from the CS.

These results were discussed in terms of
(a) Hull's principles of primary and secondary
generalization, and (b) Osgood’s application of
these principles to language behavior.


REFERENCES 


1. Bass, M. J., & Hull, C. L. The irradiation
of a tactile conditioned reflex in man.
J. comp. Psychol., 1934, 17, 47-65.

2. Cofer, C. N., & Foley, J. P. Mediated
generalization and the interpretation
of verbal behavior: I. Prolegomena.
Psychol. Rev., 1942, 49, 519-540.

3. Cofer, C. N., Janis, M., & Rowell, M. M.
Mediated generalization and the interpretation
of verbal behavior: III. Experimental
study of antonym gradients.
J. exp. Psychol., 1943, 32, 266-269.

4. Foley, J. P., & Cofer, C. N. Mediated
generalization and the interpretation of
verbal behavior: II. Experimental study
of certain homophone and synonym
gradients. J. exp. Psychol., 1943, 32,
168-175.

5. Grant, D. A. Analysis-of-variance tests
in the analysis and comparison of curves.
Psychol. Bull., 1956, 53, 141-154.

6. Haggard, E. A. On the application of
analysis of variance to GSR data: I.
The selection of an appropriate measure.
J. exp. Psychol., 1949, 39, 378-392.

7. Hovland, C. I. The generalization of
conditioned responses: I. The sensory
generalization of conditioned responses
with varying frequencies of tone. J.
gen. Psychol., 1937, 17, 125-128.

8. Hovland, C. I. The generalization of
conditioned responses: II. The sensory
generalization of conditioned responses
with varying intensities of tone. J.
genet. Psychol., 1937, 51, 279-291.

9. Hull, C. L. Principles of behavior. New
York: Appleton-Century-Crofts, 1943.

10. Nickerson, D., & Newhall, S. M. A
psychological color solid. J. Opt. Soc.
Amer., 1943, 33, 419-422.

11. Osgood, C. E. Method and theory in
experimental psychology. New York: Oxford
Univer. Press, 1953.

12. Razran, G. H. S. A quantitative study of
meaning by a conditioned salivary technique
(semantic conditioning). Science,
1939, 90, 89-90.

13. Riess, B. F. Semantic conditioning involving
the galvanic skin reflex. J.
exp. Psychol., 1940, 26, 238-240.

14. Solomon, R. L., & Postman, L. Frequency
of usage as a determinant of
recognition thresholds for words. J.
exp. Psychol., 1952, 43, 195-201.

15. Tukey, J. W. Comparing individual
means in the analysis of variance.
Biometrics, 1949, 5, 99-114.

(Received January 14, 1957) 











Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 


TRANSFER AFTER TRAINING WITH SINGLE 
VERSUS MULTIPLE TASKS¹


CARL P. DUNCAN 


Northwestern University 


Studies of learning to learn (e.g., 6) 
and learning set (e.g., 4) show clearly 
that performance improves during 
practice on a series of similar tasks. 
However, most such experiments have 
not compared performance on a trans- 
fer task of the group that had prac- 
ticed on the series of tasks with 
another group that had been given 
the same total amount of practice 
on a single task. Because of this, 
it cannot be determined how much 
of the increasing positive transfer 
called learning to learn or learning 
set is due to experience with a variety 
of tasks and how much is due to 
amount of practice per se. Since 
positive transfer varies directly with 
amount of practice even on a single 
task (9), it seems necessary to de- 
termine if anything is learned from 
practice on a series of tasks that is 
not learned from an equal amount of 
practice on one task. This is the 
problem of the present study. 


A design in which at least two groups are
given the same total amount of training, one on 
a single task (constant training), the other on a 
series of tasks (varied training), with both groups 
tested on the same transfer task, has apparently 
been used in only two previous studies. Dashiell 
(2), using code substitution, found that during 
training the constant group (same code every 


¹ This research was performed under Contract
No. AF 33(616)-308 between Northwestern
University and the Psychology Branch of the 
Aero Medical Laboratory, Directorate of Re- 
search, Wright Air Development Center. I am 
much indebted to the following people for super- 
visory work and for advice: Drs. Benton J. 
Underwood, Ross L. Morgan, Edward Schwartz, 
John C. Jahnke, and John W. Cotton. Space 
limitations prevent giving credit to the large 
group of people who assisted in collecting the 
data. 








day) showed considerable improvement in its 
one task, while the varied group (new code every 
day) improved only slightly. However, on 
transferring to a new code, the varied group 
performed better. 

The most thorough study was done by Crafts 
(1). In his several experiments he found that 
in all types of tasks varied training produced 
superior transfer only when some characteristic 
of a series of tasks remained unchanged (“com- 
mon element”) over all (including transfer) 
tasks, while other characteristics varied. When
the common element was eliminated, varied 
training produced no better transfer. 


In spite of Crafts’ suggestion that 
varied training is advantageous only 
when common elements are present, 
Dashiell’s findings indicate that some 
type of habit that facilitates positive 
transfer can be developed when there 
is only a general similarity among 
tasks. This approach, i.e., training
with tasks having only a nonspecific 
similarity, is used in the present study. 

In this paper varied training is 
treated as a continuum, with constant 
training as one extreme, because 
certain degrees of varied training may 
be advantageous while other degrees 
may not. Degree of varied training 
is here defined in terms of the number 
of tasks (variations) introduced dur- 
ing training. 

The other major variable is the 
total amount of practice or training 
given. This variable is necessary, 
since if total practice were equal for 
all degrees of varied training, in- 
creasing the number of training tasks 
means decreasing the amount of 
practice on each. Since it is not 
known whether varied training can 
be treated in terms of variation per se, 
or whether there is an interaction 










between the number of variations 
and the amount of practice on each, 
total amount of practice, and there- 
fore amount of practice on each train- 
ing variation, will also be manipulated 
to permit testing for this interaction. 


METHOD 


Apparatus.—The manipulandum, operated
from a sitting position, was a lever, 24 in. long, 
the top, free end of which could be moved into 
any one of 13 slots cut 1 in. deep and 1 in. apart 
in a steel plate. The slots were arranged in a 
semicircle concave to S. A red jewel light was 
immediately above, a microswitch immediately 
below, each slot. Movement of the lever into 
any slot depressed the microswitch and flashed 
on the light above whichever slot was correct 
for the stimulus showing, thus informing S 
which slot was correct immediately after each 
response. The slots were also numbered from 1 
to 13, from left to right, with a large numeral 
printed above each jewel light. Immediately 
above the lights and numerals was the aperture of 
a memory drum, the front surface of which fitted 
into a hole in a large screen which prevented S 
from seeing E or the rest of the apparatus. 

Behind the screen E faced a panel of two rows 
of 13 lights each which were numbered in each 
row. The light on in the top row indicated to 
E which slot was correct at the moment; the 
light on in the bottom row indicated which slot 
S’s lever had entered. Recording of correct 
and incorrect responses was done manually by 
E. A set of 13 telephone jacks permitted pairing
of a set of stimuli with the slots in any order. 

Tasks.—Since responses were always movements
of the lever into slots, a task is defined as a
set of 13 stimuli. Thus, varied training was 
accomplished primarily by training with different 


TABLE 1
CONDITIONS OF THE EXPERIMENT
(Entries are the number of trials
on each training task.)

Number of               Days of Training
Training Tasks          2       5       10

10                      4       10      20
5                       8       20      40
2                       20      50      100
1                       40      100     200
10R                     4       10      20

Total trials            40      100     200




sets of stimuli (although it will be seen later that 
another method of varied training received some 
attention). 

There were 10 tasks used only during training, 
and two tasks used only during the two transfer 
tests. Each training task consisted of 13 
relatively meaningless forms. Each stimulus 
form within a task was produced by drawing 
elaborations (surplus lines) on a single “theme” 
or basic figure, such as a circle, a letter, etc. A 
different theme was used for each task, thus no 
stimulus in any task had any obvious similarity 
to a stimulus in another task. Because of this, 
and because stimuli were assigned to slots 
haphazardly, it will be assumed that there was 
little or no transfer among tasks based on
specific stimulus generalization.

Tasks used for the two transfer tests were: 
(a) H figures, 13 forms built on a theme (capital 
H) not used for any training task, and (b)
nonsense syllables of low association value and
low intratask similarity. Thus, transfer was
tested both with a task (H figures) that had
relatively high, and a task (nonsense syllables) 
that had relatively low, over-all similarity to the 
training tasks. 

All sets of training and transfer stimuli were 
mounted on tapes cut to fit the memory drum. 
To prevent serial learning, the 13 stimuli in each 
task were mounted in 12 different orders in a 
single vertical column on the tape. Since the 
tape was an endless belt, there was no apparent 
beginning or end of a task. There were no 
rests between trials, but after every 39 stimulus 
presentations (three trials or three orders), a 
blank space on the tape appeared for 4 sec. The 
stimuli were machine paced, each appearing 
for 4 sec. 

Conditions.—Manipulation of the number-of-training-tasks
variable was accomplished by
training different groups with 1, 2, 5, or 10 tasks.
The 1-task condition is defined as constant
training and can be considered the control for 
varied training provided by the 2-, 5-, and 10- 
task conditions. There is no control group for 
transfer per se, i.e., a group tested on transfer 
tasks without any training. 

Amount of practice was varied by giving 2, 
5, or 10 days of training at the rate of 20 trials 
per day. Thus, the 1-task group that was
trained for 10 days received 200 training trials 
on its one task; the 10-task group trained for 
two days received only four trials on each task, 
etc. Conditions of the experiment, and number 
of trials on each training task for each condition, 
are shown in Table 1. 
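
The arithmetic behind the conditions can be sketched in Python: total trials are days times 20, divided equally among the training tasks. The constant and function names below are illustrative, not part of the original report.

```python
TRIALS_PER_DAY = 20
DAYS_OF_TRAINING = (2, 5, 10)
NUMBER_OF_TASKS = (10, 5, 2, 1)


def trials_per_task(days, n_tasks):
    """Trials given on each training task when total practice is split evenly."""
    total = days * TRIALS_PER_DAY
    if total % n_tasks != 0:
        raise ValueError("total trials do not divide evenly among tasks")
    return total // n_tasks


table = {
    n_tasks: [trials_per_task(days, n_tasks) for days in DAYS_OF_TRAINING]
    for n_tasks in NUMBER_OF_TASKS
}

for n_tasks, row in table.items():
    print(f"{n_tasks:>2} tasks: {row}")
```

Under this rule the 10-task group trained for two days receives four trials per task and the 1-task group trained for ten days receives 200, as stated in the text.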

The re-paired task condition —When Ss are 
trained with different sets of stimuli they not 
only receive practice with a greater variety of 
stimuli than do Ss in the constant-training con- 


based on 














TRANSFER WITH SINGLE AND MULTIPLE TASKS 65 


dition, but they also receive more practice at 
starting from scratch and gradually acquiring 
S-R associations. As a partial check on the 
importance of this factor, i.e., experience at 
forming new associations, another method of 
varied training was employed in which only one 
set of stimuli was used throughout training and 
different “tasks” were provided by re-pairing 
the stimuli with the slots in completely different 
combinations. The only re-paired-task groups 
run (indicated by the row labeled 10R in Table
1) were three (with 2, 5, or 10 days of training) 
that were trained with 10 completely different 
re-pairings of the stimuli and responses. 

Notation.—Each of the 15 groups indicated in 
Table 1 will be denoted by two numbers, the 
first number indicating number of days of 
training, the second indicating number of train- 
ing tasks, thus, Group 10-10, Group 5—10R, etc. 
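
The group notation can be generated mechanically, as in the following Python sketch. The label format follows the examples in the text (days first, then task condition); the names of the constants and the function are illustrative.

```python
DAYS_OF_TRAINING = (2, 5, 10)
TASK_CONDITIONS = ("10", "5", "2", "1", "10R")


def group_labels():
    """Build 'days-tasks' labels such as '10-10' or '5-10R' for every condition."""
    return [
        f"{days}-{tasks}"
        for days in DAYS_OF_TRAINING
        for tasks in TASK_CONDITIONS
    ]


labels = group_labels()
print(len(labels), "groups:", ", ".join(labels))
```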

Assignment of training tasks.—Each S in the 
three 1-task and the three 10R groups used only
one set of stimuli throughout training; in these 
groups the 10 training tasks (sets of stimuli) 
were assigned to Ss in turn. For each S in the 
10R groups, the 10 re-pairings of the stimuli 
and responses were such that no stimulus was 
ever paired with the same slot more than once. 
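
One way to satisfy the constraint that no stimulus is paired with the same slot more than once is sketched below in Python. The cyclic-shift construction is an assumption chosen for simplicity; the text does not say how the re-pairings were actually generated, and the function names are illustrative.

```python
N_SLOTS = 13
N_REPAIRINGS = 10


def cyclic_repairings(n_slots=N_SLOTS, n_repairings=N_REPAIRINGS):
    """Return n_repairings lists; entry i of each list is the slot (1-based)
    assigned to stimulus i. Each re-pairing shifts every stimulus by one slot."""
    return [
        [(stimulus + shift) % n_slots + 1 for stimulus in range(n_slots)]
        for shift in range(n_repairings)
    ]


def no_repeated_pairing(repairings):
    """True if no stimulus is assigned the same slot in two different re-pairings."""
    n_stimuli = len(repairings[0])
    for stimulus in range(n_stimuli):
        slots = [pairing[stimulus] for pairing in repairings]
        if len(set(slots)) != len(slots):
            return False
    return True


repairings = cyclic_repairings()
print("constraint satisfied:", no_repeated_pairing(repairings))
```

A cyclic shift keeps neighboring stimuli in neighboring slots across re-pairings, so a procedure aiming at "completely different combinations" would probably also shuffle the assignments within this constraint.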

For the 10-task groups, the 10 training tasks 
were arranged in 10 completely different orders 
and the orders were assigned to Ss in turn. For 
the 2-task and 5-task groups, the training tasks 
chosen, and the order in which they were prac- 
ticed, were such that each of the 10 available 
tasks was used equally often, and about equally 
often in each position in an order. 

Transfer tests —The first transfer test was 
given 24 hr. after completion of training, the 
second 24 hr. later. Each test consisted of 
20 trials.² Both transfer tasks were used in
both tests by splitting each of the 15 main
groups of Ss into two subgroups. One set of subgroups
was tested with nonsense syllables on the
first test, with H figures on the second; for the 
other set of subgroups the order was reversed. 

All Ss worked a 5-day week, so for groups 
trained 5 days or 10 days, but not for groups 
trained 2 days, a weekend intervened between
end of training and first transfer test. 

Subjects and procedure.—The Ss were 600
male and female undergraduates at Northwestern
University, paid for their services.
There were 40 Ss in each of the 15 main groups,
20 Ss in each of the 30 subgroups.

² Most of the Ss in Groups 10-10, 10-1, and
10-10R had been given 21 trials on all training
and transfer days before it was finally decided
to use 20 trials per day as the basic unit. In
the course of final matching of all groups about
one-third of the Ss in these groups were replaced
with Ss given 20 trials a day. Examination of
some of the data showed no difference between
20- and 21-trial Ss.

Instructions to S described the nature of the 
learning task and emphasized making as many 
correct responses as possible; the latter point 
was mentioned at beginning of practice each 
day. Instructions also specified that it was 
necessary to make one, and only one, response 
every time a stimulus appeared; thus, there is 
no independent error measure and the data 
are reported in terms of correct responses. 


RESULTS 


Comparability of groups.—Enough 
Ss were run to permit eventual 
matching of all 30 subgroups with 20 
Ss in each, matched on mean total 
correct responses on Trials 2-4 of the 
first training task. (Scores on Trial 
1 would be largely chance and some 
groups received only four trials per 
task.) The 30 matching means ranged 
from 6.22 to 6.87; standard errors 
ranged from .37 to .86. Although 
it is not known how adequate match- 
ing was for groups not given more 
than four trials on the first task, a 
check made with some groups given 
20 trials on that task showed that 
matching on Trials 2-4 produced 
groups that were not significantly 
different on total score over 20 trials. 
The correlation, based on 100 Ss, 
between scores on Trials 2-4 and 
Trials 2-20 was .46. 

Training.—Training data will not 
be presented in detail, but some points 
are worth noting. In groups trained 
for several days on one task (e.g., 
Group 10-1), a few Ss mastered the 
task (13 correct responses per trial) 
by the end of the first day, i.e., first
20 trials, and all 40 Ss mastered the 
task by the end of Day 4. Thus, 
all Ss in two of the three constant- 
training groups (Groups 10-1 and 
5-1) had thoroughly mastered their 
one training task before being tested 
for transfer, but this was not true
for Group 2-1. This difference will
show up in transfer performance.

Fig. 1. Performance during training of the three groups trained with 10 different sets of stimuli,
on the left; and of the three groups trained with 10 re-pairings of the same stimuli, on the right.
Abscissa: training tasks 1-10. Solid-line curves: Groups 10-10 and 10-10R (20 trials per task),
Groups 5-10 and 5-10R (10 trials per task), and Groups 2-10 and 2-10R (4 trials per task).
Dashed and dotted curves: first 10 trials and first 4 trials of Groups 10-10 and 10-10R, and
first 4 trials of Groups 5-10 and 5-10R.

Groups trained with several tasks 
showed improvement in performance 
on successive tasks, as shown in 
Fig. 1. The six groups depicted in 
Fig. 1 are the three trained with 10 
different tasks (left side of Fig. 1), 
and the three trained with 10 re- 
pairings of the same task (right side), 
for 2, 5, or 10 days. Each symbol 
on a solid-line curve in Fig. 1 repre- 
sents mean correct responses per 
trial over all trials given that group on 
a task (see Table 1). Thus vertical
differences among solid-line curves
in either side of Fig. 1 are of no
significance, since the points are
based on different numbers of trials.
Explanation of the dashed and dotted
lines in Fig. 1 is given in the legend.

Figure 1 shows that all groups
improved from the first to tenth task.
Even the apparently small gain made 
by Group 2-10R is highly significant; 
t for related measures between the 
first and tenth tasks was 6.89. It 
cannot be assumed that re-paired 
groups learned the same thing as 
different-task groups; the transfer 
data will show that they did not. 
Comparison of any one of the dashed 
or dotted curves in Fig. 1 with the 
solid-line curve nearest to it gives 
some indication of whether or not 
the intertask improvement exhibited 
in the first 4 (or 10) trials on each 
task is greater in a group given more 
than 4 (or 10) trials per task than it is
in a group given only 4 (or 10) trials 
per task. These comparisons were 
not analyzed statistically because 
there are obvious effects due to
sequencing of tasks over days, but
the curves do suggest that intertask
improvement as measured in terms of
performance on an early block of trials
on each task is greater when each
task is practiced beyond the number
of trials constituting the block.

Transfer.—The data of major interest
are performances of groups
trained with different tasks and
tested on two transfer tasks counterbalanced
over two transfer tests.
Transfer data for groups trained with
re-paired tasks will be analyzed separately.
All analyses to be presented
are based on total correct responses
over all 20 trials given on a transfer
task as the score for each S.

First transfer test. Since each major
group was split into two subgroups,
one tested with nonsense syllables,
the other with H figures, on the
first test, the data were analyzed
for three variables: amount of training,
number of training tasks, and
transfer task. The analysis of variance
based on these 24 subgroups
(omitting the 6 subgroups trained
with re-paired tasks) is summarized in
Table 2, where Days and Tasks
indicate the two training variables
and Transfer indicates transfer task.
The test for heterogeneity of variance
gave χ² = 34.22, which is not significant
with 23 df.

TABLE 2

Comparison of Performance on the First
Transfer Test of the Groups
Trained with 1, 2, 5, or 10 Tasks

Source           df            MS            F
Days (D)         2             34707.08      26.61*
Tasks (T)        [illegible]   [illegible]   [illegible]
Transfer (Tr)    1             27403.53      21.01*
D X T            6             818.34
D X Tr           [illegible]   [illegible]
T X Tr           [illegible]   [illegible]
D X T X Tr       6             212.44
Within           456           1304.20

Fig. 2. Performance of all groups on the first transfer test, on the left; and on the second test,
on the right. Performance on the two transfer tasks is combined within each test.
Abscissa: training tasks (1, 2, 5, 10, 10R); parameter: 10, 5, or 2 days of training.

Table 2 indicates that none of the 
interactions between training vari- 
ables and the transfer-task variable 
(D X Tr., T X Tr., D X T X Tr.) 
was significant. It is therefore not 
necessary to report data for the two 
transfer tasks separately. In terms 
of over-all performance there was 
a significant difference between trans- 
fer tasks (F = 21.01); performance on 
H figures was lower. 
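
As an arithmetic check on the legible entries of Table 2, each F is the mean square for the effect divided by the Within mean square. The sketch below recomputes the two F ratios that can be read from the table; the helper name `f_ratio` is illustrative, and only values printed in Table 2 are used.

```python
MS_WITHIN = 1304.20  # Within mean square from Table 2 (456 df)

# Effect name -> (mean square, F as printed in Table 2)
reported = {
    "Days (D)": (34707.08, 26.61),
    "Transfer (Tr)": (27403.53, 21.01),
}


def f_ratio(ms_effect, ms_error):
    """F ratio for a fixed effect tested against the error mean square."""
    return ms_effect / ms_error


recomputed = {
    name: round(f_ratio(ms, MS_WITHIN), 2) for name, (ms, _) in reported.items()
}

for name, (_, printed) in reported.items():
    print(f"{name}: printed F = {printed}, recomputed F = {recomputed[name]}")

# The D X T mean square (818.34) is smaller than the error term, so F < 1.
print("D X T F < 1:", f_ratio(818.34, MS_WITHIN) < 1.0)
```

Both recomputed ratios match the printed values to two decimals, and the D X T ratio falls below 1, in line with the later statement that this interaction is not significant.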

Both training variables had highly 
significant effects on transfer, as the 
Days and Tasks terms in Table 2 
show. These effects are shown in 
the left side of Fig. 2, where mean 
total correct responses is plotted as 
a function of number of training 
tasks, with amount of training as the 
parameter. Since performance on the
two transfer tasks is combined, each
point is based on 40 Ss. (The solid
symbols in Fig. 2 show performance 
of re-paired groups, and will be dealt 
with later.) 

Taken together, Table 2 and Fig. 2 
show that transfer increased both as 
number of training tasks and as 
amount of training increased. Even 
though there is no true control group, 
it is highly probable that net transfer 
was positive. First, mean total cor- 
rect responses on the first training 
task was 125.51 (based on all Ss 
given at least 20 trials and with all 
10 sets of training stimuli repre- 
sented); no value in Fig. 2 is lower 
than this. Second, the transfer tasks 
were not easier than the combined 
training tasks. The best performance 
in the left side of Fig. 2 is by Group 
10-10, but even its performance on 
the easier transfer task (nonsense 
syllables) was lower than its own
mean total score of 207.95 on the last
(tenth) training task. Thus, per- 
formance on transfer tasks was higher 
than performance on the combined 
10 sets of stimuli making up the 
first training task very probably 
because of positive transfer, not 
merely because the transfer tasks 
were easier. 

Returning to Table 2, it can be 
seen that the interaction between 
training variables (D X T) is not 
significant. This is an important 
finding and will be discussed later. 

Since number of training tasks was 
significant, the effect on transfer of 
constant versus varied training can 
be determined by comparing the 
mean of the 1-task (constant training)
group with means of the 2-, 5-,
and 10-task groups. Because no in- 
teraction in Table 2 was significant, 
groups and subgroups differentiated 
on other variables were combined 
to yield means based on 120 Ss each. 
These means were 143.95, 154.88, 
158.85, and 164.23 for the 1-, 2-, 5-, 
and 10-task groups respectively. 
Comparison of the 1-task mean with
the other three values yielded ts of 
2.34, 3.19, and 4.34 respectively 
(standard error of difference obtained 
from Within Groups term in Table 2). 
Since all these values are significant 
at the 2% level or better, it is clear 
that all degrees of varied training 
produced better transfer than con- 
stant training. There is also some 
evidence that greater degrees of 
varied training yielded more transfer 
than lesser degrees; the 10-task mean 
is significantly higher than the 2-task 
mean (t = 2.00, P = .05). Other comparisons
were not significant.

Means for groups given different 
amounts of training were 138.49, 
163.58, and 164.37 for the 2-, 5-, and 
10-day groups, respectively (all groups
attributable to other variables combined,
160 Ss per mean). The 2-day
mean is significantly lower than 
either the 5-day (t = 6.21) or 10-day 
(t = 6.41) means; 5- and 10-day 
values are not significantly different. 
Not surprisingly, transfer increased, 
up to a certain point, as a function 
of amount of training. 
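
The t values reported for the Tasks and Days comparisons can be reproduced from the printed means. The sketch below assumes the standard error of a difference between two means of n Ss each is sqrt(2 × MS within / n), which is one common way of obtaining it from the Within Groups term of Table 2; the original report does not print the formula, and the function name `t_from_means` is illustrative.

```python
import math

MS_WITHIN = 1304.20  # Within Groups mean square, Table 2


def t_from_means(mean_a, mean_b, n_per_mean, ms_within=MS_WITHIN):
    """t for the difference between two independent means of equal n,
    with the standard error taken from a pooled error mean square."""
    se_diff = math.sqrt(ms_within * 2.0 / n_per_mean)
    return (mean_a - mean_b) / se_diff


task_means = {1: 143.95, 2: 154.88, 5: 158.85, 10: 164.23}  # 120 Ss per mean
day_means = {2: 138.49, 5: 163.58, 10: 164.37}              # 160 Ss per mean

# Varied training (2, 5, 10 tasks) against constant training (1 task).
t_tasks = {k: t_from_means(task_means[k], task_means[1], 120) for k in (2, 5, 10)}
# 10-task against 2-task groups.
t_10_vs_2 = t_from_means(task_means[10], task_means[2], 120)
# 5- and 10-day groups against the 2-day groups.
t_days = {k: t_from_means(day_means[k], day_means[2], 160) for k in (5, 10)}

for k, t in t_tasks.items():
    print(f"{k} tasks vs 1 task: t = {t:.2f}")
print(f"10 tasks vs 2 tasks: t = {t_10_vs_2:.2f}")
for k, t in t_days.items():
    print(f"{k} days vs 2 days: t = {t:.2f}")
```

Under this assumption the recomputed values agree with the reported ones (2.34, 3.19, 4.34, 2.00, 6.21, and 6.41) to within about .01, the small discrepancies presumably reflecting rounding of the printed means.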

Evaluation of the results on the first transfer
test of the six subgroups (three
amounts of training, two transfer
tasks) trained on 10 re-pairings of
the same set of stimuli with responses
involves the comparison of these Ss
with the six constant-training (1-task)
subgroups in the analysis summarized
in Table 3 (where Tasks
indicates the type of training). The
test for heterogeneity of variance
gave χ² = 13.50, which is not significant
with 11 df.

Since none of the interactions in
Table 3 between training variables
and the transfer-task variable was
significant, the solid symbols in the
left half of Fig. 2 show performance
of the three main re-paired groups
with subgroups combined. Table 3
also shows that Days (of training)
and Transfer (task) were significant
variables, and that the interaction
of training variables (D X T) was not.
These results correspond to those
found with different-tasks groups
(Table 2). But the important term
in Table 3 is the one for Tasks, which
is not significant; there was no difference
in transfer among constant-training
and re-paired groups. Thus,
practice at associating stimuli with
responses, with which re-paired groups
had as much experience as did groups
trained with 10 different sets of
stimuli, cannot account for the demonstrated
transfer advantage of training
with different tasks.

TABLE 3

Comparison of Performance on the First
Transfer Test of the Groups
Trained with Re-paired Stimuli
and the Constant-Training Groups

Source           MS            F
Days (D)         21210.47      12.34*
Tasks (T)        1131.01
Transfer (Tr)    14492.61      8.43*
D X T            1286.02
D X Tr           847.82
T X Tr           495.92
D X T X Tr       630.61
Within           1718.98

* P = .01

TABLE 4

Analysis of the Effect of Order of
Presentation of Transfer Tasks
Over the Two Transfer Tests

Source              df            MS
[illegible]         [illegible]   25781.90
[illegible]         1             83794.75
[illegible]         478           478.50
[illegible]         1             3099.60
Ss within orders    478           2201.21

* P = .01

Second transfer test.—In this test 
each subgroup was tested with what- 
ever transfer task (nonsense syllables 
or H figures) it had not practiced 
on the first test. Since there might 
be an effect of order of transfer tasks 
over tests, total correct responses of 
all four combinations of transfer test 
and transfer task (obtained by com- 
bining all groups attributable to 
training variables) were compared to 
permit evaluation of order. The 
analysis is summarized in Table 4 
(groups trained with re-paired stimuli 
are not included). 

In Table 4 the term for Order 
(tested against Ss within orders) is 
not significant; there is no evidence 
of an interaction due to order of 
transfer tasks over tests. Since the 
term for Task is highly significant 
(tested against Error), again indicating
that over-all performance was
lower on H figures than on nonsense
syllables, the fact that Order was 
not significant indicates that there 
was no differential transfer in going 
from easy to difficult or from difficult 
to easy transfer tasks. The signifi- 
cant term for Test (tested against 
Error) indicates that over-all per- 
formance was higher on the second 
test, presumably due to additional 
“training” provided by practice on 
the first test. Most of the gain
occurred in groups trained originally
with only one task.

Analysis of groups trained with 
different tasks (performed as illustrated
in Table 2) showed that,
as on the first test, none of the interactions
between the transfer-task
variable and training variables was 
significant. Therefore, performance 
on the second test, plotted on the 
right in Fig. 2, is shown with sub- 
groups attributable to transfer tasks 
combined. Over-all difference in per- 
formance on the transfer tasks was, 
as usual, highly significant (F = 53.74, 
P = .01, 1 and 456 df).

Amount of training (Days) was 
still a highly significant variable on 
the second test (F = 17.24, P = .01,
2 and 456 df). But the effect of
training with different numbers of
tasks was significant only at the 5%
level (F = 3.82, where 3.83 is needed 
at the 1% level with 3 and 456 df). 
This reduced effect of the Tasks 
variable may be seen, although not 
too clearly, by comparing steepness 
of corresponding curves in the left 
and right sides of Fig. 2. 

As before, means of the 1-, 2-, 5-,
and 10-task groups were compared
by t-tests. Both the 5- and 10-task
groups were significantly superior
to the 1-task group at the 2% level
or better (t's of 2.44 and 2.96, respectively),
but the 2-task group
was not (t < 1.00). Thus, on the
second test only greater degrees of
varied training produced superior
transfer to constant training.

Comparison of groups trained with 
re-paired stimuli with 1-task groups
was made by an analysis like that 
shown in Table 3. As on the first 
test, the only significant terms were 
those for Days of training and for 
Transfer task (both significant at the 
1% level); re-paired training pro- 
duced no better transfer than constant 
training (F tasks < 1.00). Perform- 
ance of re-paired groups on the second 
test is shown by solid symbols on the 
right side of Fig. 2. 


Discussion 


The major findings were: (a) transfer
was a direct function of degree of variation
in training; (b) the relation between
transfer and degree of variation in
training was independent of amount of
training.


The finding that transfer increased 
directly with increases in degree of 
variation in training supplies the answer 
to the question with which this study is
primarily concerned: varied training 
produced better transfer than constant 
training. Furthermore, this result was 
found for the case where total amount 
of practice was equal for both constant 
and varied training. When total prac- 
tice is equal, there is, of course, much 
more practice on the one task given in 
constant training than there is on any 
one of the tasks used in varied training, 
but, as the results show, this was not 
important. 

The advantage of varied over constant 
training was probably not merely the 
difference between positive transfer and 
zero transfer, between training and no 
training. Although there was no control 
group for transfer per se, there is, as 
shown earlier, every reason to think 
that there was net positive transfer 
from training to transfer tasks. This 
means that even constant training produced
some positive transfer, so the
superiority of varied training was meas- 
ured in terms of even larger amounts 
of positive transfer. 

The second major finding was that 
there was no interaction between degree 
of variation in training and amount of 
training. In other words, the advantage 
of varied over constant training was 
not affected by varying total amount of 
training (as long as total training, 
whatever its amount, was equal for 
both varied and constant training con- 
ditions). This fact, that there was no 
interaction between the two training 
variables, is one more indication that 
what is important in varied training is 
variation per se. In short, S learns 
something from being required to prac- 
tice with different sets of stimuli, and, 
as in most learning situations, the most 
important variable is the number of
“trials” (sets of stimuli); the more
trials, the more he learns.

In asking what S learns by working
with different sets of stimuli it should
be noted that we are not dealing with
transfer which is based on stimulus or
response generalization in the usual
sense, or with transfer based on stimulus 
predifferentiation. Rather, we are con- 
cerned with the kind of nonspecific 
habit or skill that is similar to what has 
been called learning to learn (6) or 
learning set (4), and that facilitates 
transfer among tasks which have over- 
all similarity but which lack easily 
identifiable dimensions of stimulus or 
response similarity. But curves of learn- 
ing to learn and learning set probably 
overestimate the kind of nonspecific 
habit with which the studies of Dashiell 
(2), Crafts (1), and the present study 
were concerned; the progressive im- 
provement such curves show is not 
entirely due to varied training. 

In attempting to infer something 
about this nonspecific skill, or generalized 
ability to learn, the data from groups 
trained with re-paired stimuli are useful. 
These groups, like groups trained with 
different sets of stimuli, had to start 
from scratch on every new “task”
(re-pairing), and also showed considerable
improvement from their first to
their tenth re-pairing. Yet, unlike different-task
Ss they showed no better
transfer than Ss given constant training. 
(It should be noted that the transfer 
tasks involved new stimuli, and so may 
not be the most appropriate tasks for 
testing re-paired groups.) Thus, the 
generalized learning skill developed by 
Ss trained with different sets of stimuli 
is not due to response differentiation or 
other factors on the response side, nor to 
experience at associating stimuli and 
responses. All of these factors should 
be the same from training with either 
different stimuli or re-paired stimuli. 
The skill is also not due to such factors 
as getting used to the situation, reduc- 
tion of tension, etc., since these should 
be the same for any kind of training, 
including constant training, when total 
practice, and therefore total time in the 
situation, is the same. It seems clear 
that the generalized skill that facilitated 
transfer performance in the present 
study was developed only from ex- 
perience with different sets of stimuli. 
Attempts to specify the basis of a 
skill which is developed from experience 
with stimuli have been made by other 
writers. Crafts’ (1) suggestion was that 
varied training compelled the develop- 
ment of habits of looking, searching, 
exploring, as habits antecedent to the 
final response. Kurtz (5) believed that 
S develops “observing responses.” Reid
(7) suggested a “response of discriminating.”
Eckstrand and Wickens (3) suggested
that Ss develop a “perceptual set,”
that during varied training S becomes
more sensitive to relevant than to 
irrelevant dimensions among stimuli. 
The writer’s view is very similar to 
those cited. Varied training seems to 
force S to pay close attention to every 
stimulus in every set. In time this 
response of concentrated attention may 
become habitual; S may learn, as a
general, transferable principle, that it is 
of value to look carefully at each stimulus 
presented, not only to its obvious 
characteristics, but also to any minor 
details. If S does this, he should be
able to discriminate easily among the
stimuli within a list and between different
lists, thus minimizing both intralist and 
interlist interference. (Riopelle (8) has 
already shown that suppression of inter- 
task interference aids the development 
of learning sets.) In short, it is hypoth- 
esized that training with a variety of 
stimuli forces S to concentrate carefully
on every stimulus, making use of all
cues the stimulus provides, and that
as a result stimuli soon become easily
discriminable and enter more readily 
into S-R associations. 

From this hypothesis it would be 
predicted that amount (not necessarily 
rate) both of learning to learn in the 
simple sense of intertask improvement, 
and of the generalized ability to learn, 
should vary directly with degree of 
intralist, and perhaps interlist, similarity, 
and inversely with meaningfulness of 
stimuli. Evidence bearing on these 
predictions is not at present available. 


SUMMARY 


Transfer among perceptual-motor paired-associates
tasks was studied as a function of two
variables: degree of variation in training, which 
was defined in terms of the number of different 
sets of training stimuli, and amount of training. 
Different groups of Ss were trained with 1, 2, 5, 
or 10 tasks (different sets of stimuli) for 2, 5, 
or 10 days (20 trials per day). Some other 
groups were trained for 2, 5, or 10 days with 
10 different re-pairings of the responses with a 
single set of stimuli. Following training, all Ss 
were tested for transfer to two new sets of stimuli. 

The results were: 

1. Among groups trained with different sets 
of stimuli, transfer increased as a direct function 
of degree of variation in training. In general, 
when total amount of training was equal, all 
degrees of varied training (2, 5, or 10 tasks) 
produced better transfer than constant training 
(1 task). 

2. There was no interaction between degree 
of variation in training and amount of training;
although transfer increased, up to a limit, as
total training increased, the transfer superiority 
of varied over constant training was not sig- 
nificantly affected by changes in amount of 
training. 

3. Groups trained by re-pairing the same 
stimuli with the responses exhibited, as did 
groups trained with different sets of stimuli, 
considerable intertask improvement during 
training, but showed no better transfer than 
constant training. 

It was suggested that these results may be 
best interpreted in terms of observational or 
perceptual processes. 


REFERENCES 


1. Crafts, L. W. Routine and varying practice
as preparation for adjustment to new
situations. Arch. Psychol., N. Y., 1927,
14, No. 91.

2. Dashiell, J. F. An experimental isolation
of higher level habits. J. exp. Psychol.,
1924, 7, 391-397.

3. Eckstrand, G. A., & Wickens, D. D.
Transfer of perceptual set. J. exp.
Psychol., 1954, 47, 274-278.

4. Harlow, H. F. The formation of learning
sets. Psychol. Rev., 1949, 56, 51-65.

5. Kurtz, K. H. Discrimination of complex
stimuli: the relationship of training and
test stimuli in transfer of discrimination.
J. exp. Psychol., 1955, 50, 283-292.

6. McGeoch, J. A., & Irion, A. L. The psychology
of human learning. New York:
Longmans, 1952.

7. Reid, L. S. The development of noncontinuity
behavior through continuity
learning. J. exp. Psychol., 1953, 46,
107-112.

8. Riopelle, A. J. Transfer suppression and
learning sets. J. comp. physiol. Psychol.,
1953, 46, 61-64.

9. Underwood, B. J. Experimental psychology.
New York: Appleton-Century-Crofts,
1949.


(Received January 14, 1957) 








Journal of Experimental Psychology
Vol. 55, No. 1, 1958


EFFECT OF A WARNING SIGNAL PRECEDING A 
NOXIOUS STIMULUS ON VERBAL RATE 
AND HEART RATE¹


FREDERICK H. KANFER²


Purdue University 


In a study with laboratory animals 
Estes and Skinner (2) showed that a 
previously neutral stimulus after re- 
peated pairing with an electric shock 
comes to exercise a marked de- 
pressant effect on the rate of a well 
established bar-pressing response. 
The experimental operations were 
related to the concept of “anxiety” 
and the latter was defined as “an 
emotional state arising in response 
to some current stimulus which in 
the past has been followed by a 
disturbing stimulus (2, p. 400).” 
This definition of anxiety is accepted 
for use in the present study. 

In view of the clinical interest 
in the anxiety concept and the 
suggestion of Skinner (6) and Keller 
and Schoenfeld (4) that this depres- 
sive effect due to anticipation of a 
noxious stimulus may account for a 
variety of behavior changes encoun- 
tered in daily life and especially in 
psychopathology, it is desirable to 
test this paradigm directly on the 
human level. Such an investigation 
with human Ss is of additional per- 
tinence since several studies utilizing 
threat of shock have yielded data 
which suggest that this condition 
does not invariably show a disruptive 
effect on ongoing behavior but fre- 
quently facilitates the learning of a 
task for which S is motivated, even 
though the shock is unavoidable. 

1 This study was supported in part by research
grant M-963 from the National Institute
of Mental Health, U. S. Public Health Service.

2 The study was completed while the author
was on the faculty of Washington University.

The effect of “anxiety” has most
often been related to two kinds of
response changes. (a) Changes in 
various physiological measures such 
as GSR and heart rate have been 
used to estimate the presence and 
extent of an emotional state. Fur- 
thermore, these autonomic activities 
have also been conditioned to a host 
of previously neutral stimuli by use of 
shock as the unconditioned stimulus. 
(b) In clinical situations, estimates of 
“anxiety” are frequently made from 
the content of S’s verbal behavior. 
Skinner’s (6) analysis of verbal be- 
havior suggests that in its analysis 
one could utilize rate of responding
as a measure of the effect of various
experimental variables. This measure
would eliminate much of the sub- 
jective element present in content 
analysis and prediction of such re- 
sponse changes could be facilitated 
from our knowledge of the behavior 
of the rate dimension as observed in 
animals. Keller and Schoenfeld, rep- 
resenting this viewpoint, imply that 
reduction of rate of verbal output 
may occur as a function of the emo- 
tional state (anxiety) induced by 
the conditioning of a warning signal 
to a noxious stimulus. 

The purpose of the present study 
is to investigate the effect of a signal 
preceding a noxious stimulus on the 
rate of continuous verbal responding 
in order to explore the generality 
of the properties which Skinner and 
Estes ascribe to anxiety and to 
correlate any change in verbal rate 
with possible simultaneous change 
in autonomic activity. 










METHOD 


Subjects.—The Ss were 78 undergraduate 
volunteers. They were equally divided into 
four groups, with the same proportion of males 
and females assigned to each group. Data from 
five Ss had to be discarded due to apparatus 
failure or S’s failure to follow instructions. 

Apparatus.—The stimuli were administered 
by means of chronoscopes according to a pre- 
determined schedule which was the same for all 
groups. A tone was delivered by a Hickok 198 
Signal Generator, set at 375 cps. Across each 
earphone (binaural, Type PDRS8) a voltage of 
.002 was obtained. This voltage produces a 
sound pressure level of 55 db. s.p.l. in such 
earphones. This level is approximately 25 db. 
above normal threshold at 375 cps. Onset
of the tone also displaced a time marker on the 
EKG record which returned when the tone 
ended. The shock circuit allowed delivery of 
approximately .9 to 1.3 milliamps D.C., inter- 
rupted 80 times per second. The actual cur- 
rent received varied with S’s resistance, although 
wide fluctuations were reduced by an internal 
resistance network so that a current change by a 
factor of 3 was obtained for a resistance change 
of a factor of 10. Heart rate was recorded by a 
Cardiotron Model PC2 using Lead I. A tape 
recorder was used to record each session. An 
audiosignal controlled by a 2-rpm motor was 
fed separately into the recorder. The E monitored
the tape recording by earphone and used
the audiosignal to administer the stimuli. 

Experimental design.—The Ss were assigned 
to four groups. All groups were given 17 trials.
The first 12 trials comprised the acquisition 
period, the last 5 trials made up the extinction 
phase. The first 6 min. of responding were 
designed to permit stabilization of the verbal 
rate. Trial 1 started at time 6:00 and lasted for 
1 min. Intertrial intervals were randomly 
selected at 1, 1.5, 2, 2.5 and 3 min. to avoid 
temporal conditioning. The same schedule
was then used for all groups. The total session
lasted 52 min. An acquisition trial for Group
T-S (tone-shock) consisted of a 1-min. tone,
coterminal with a 1-sec. shock. For Group
C-S (control-shock) no tone was given, for
Group C-T (control-tone) no shock was given
and in Group C-N (control-nothing) neither
tone nor shock occurred. For expository purposes
the 30-sec. intervals preceding and following
the start of a trial will be called Periods
A (pre-tone) and B (post-tone), respectively.
The 30-sec. intervals preceding and following
the end of a trial will be called Periods C (pre-shock)
and D (post-shock), respectively. After
Trial 12 the shock circuit was interrupted and
the remaining 5 trials can be considered extinction
trials for Group T-S. Groups C-T and
C-N continued as before, while Group C-S was 
not given any stimuli during this phase. Heart- 
rate samples were taken in all Ss for the last 15 
sec. of Periods A and C, and the initial 15 sec. of 
Periods B and D during Trials 1, 6, 12 and 17 
(the fifth extinction trial). For the control 
groups, data from the comparable time periods 
were obtained, although the tone, shock or both 
were absent in these groups. For each trial, 
Period A was used to represent pre-trial per- 
formance as a base for evaluating any changes 
during subsequent periods. 
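
The timing of Periods A to D and of the heart-rate samples can be laid out as a small sketch. Times are in seconds from the start of the session, and Trial 1 is taken to begin at 6:00 as stated above. The `Window` type and the function names are illustrative only and are not part of the original procedure.

```python
from dataclasses import dataclass

PERIOD_SEC = 30      # length of each of Periods A-D
TRIAL_SEC = 60       # a trial (tone) lasts 1 min.
HR_SAMPLE_SEC = 15   # heart rate is sampled over 15-sec. stretches


@dataclass(frozen=True)
class Window:
    start: float  # seconds from session start
    end: float


def periods(trial_onset):
    """Periods A-D for a trial whose tone begins at trial_onset."""
    trial_end = trial_onset + TRIAL_SEC
    return {
        "A": Window(trial_onset - PERIOD_SEC, trial_onset),  # pre-tone
        "B": Window(trial_onset, trial_onset + PERIOD_SEC),  # post-tone
        "C": Window(trial_end - PERIOD_SEC, trial_end),      # pre-shock
        "D": Window(trial_end, trial_end + PERIOD_SEC),      # post-shock
    }


def heart_rate_samples(trial_onset):
    """Last 15 sec. of Periods A and C, first 15 sec. of Periods B and D."""
    p = periods(trial_onset)
    return {
        "A": Window(p["A"].end - HR_SAMPLE_SEC, p["A"].end),
        "B": Window(p["B"].start, p["B"].start + HR_SAMPLE_SEC),
        "C": Window(p["C"].end - HR_SAMPLE_SEC, p["C"].end),
        "D": Window(p["D"].start, p["D"].start + HR_SAMPLE_SEC),
    }


trial1 = heart_rate_samples(6 * 60)  # Trial 1 starts at 6:00
for name, w in trial1.items():
    print(name, w.start, w.end)
```

Because a trial lasts exactly 1 min., Periods B and C together cover the whole tone, while Periods A and D fall in the adjoining intertrial intervals.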

Procedure.—The Ss were seated in a darkened, 
soundproof room before a curtain which hid a 
microphone. Shock electrodes were applied to 
the palmar side of the second and fourth finger 
of the right hand. The EKG electrodes were 
placed on both forearms and the right ankle. 
Electrode jelly was used for all contacts. A 
brief EKG recording was taken to adjust the
proper sensitivity for each S. All Ss were
instructed to say separate words which came
to mind, continuously until told to stop. They
were asked not to use sentences, proper names or
numbers. Groups T-S and C-S were given one
sample shock and were told that other shocks
would occur. Groups T-S and C-T were told
that a tone would occur. Earphones were
placed on all Ss.


RESULTS 


Verbal rate.—The word count for
each 30-sec. interval was transcribed
from the recorded tapes. In order
to minimize unreliability, several E's
practiced counting and evolved procedural
rules until they agreed with
an error of less than 5 in a count of
over a thousand words. The basic
rules concerned the count of one
unit as any pattern of sound which
was uninterrupted, and the practice
of including all units in the preceding
time-interval if they started before
the onset of the timing “beep.”
Table 1 shows the total word count
for all groups. Bartlett’s test yielded
a chi-square value of 16.3 (P = .001)
indicating different variabilities in the
groups. The greatest variability was
found in Group C-N, suggesting that
any experimental stimulation tends to
reduce variability in verbal productivity
in the group data. Since the
assumption of homogeneity of vari- 
ance could not be made, the Kruskal 
and Wallis test was computed (8, 
p. 437). The resulting value of H 
was .574 (P > .05), indicating that 
the total verbal output did not differ 
among groups despite differential 
treatment. In order to ascertain 
whether similar verbal rates were 
maintained throughout the session, 
t tests for paired measures were per- 
formed on the sum of verbal output 
during the first and last 5 min. of 
intertrial intervals in each group. 
None of these t values reached sig- 
nificance at the .05 level. Hence, 
verbal rates can be assumed to be 
relatively stable for the duration 
of the experimental session. 
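
For readers who wish to retrace the rank test used above, a minimal sketch of the H statistic follows. The word counts in the example are invented for illustration and are not the data of this study, and the function name is likewise an assumption. With four groups, H is referred to chi square with 3 df, for which the .05 critical value is 7.815.

```python
from itertools import chain


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of samples, corrected for ties."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)

    # Give every distinct score the mean of the ranks it occupies (midrank).
    midrank = {}
    tie_sum = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i  # number of tied scores at this value
        midrank[pooled[i]] = (i + 1 + j) / 2
        tie_sum += t ** 3 - t
        i = j

    # H = 12 / (N(N+1)) * sum(R_k^2 / n_k) - 3(N+1), R_k = rank sum of group k.
    rank_term = sum(sum(midrank[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Divide by the tie correction 1 - sum(t^3 - t) / (N^3 - N).
    return h / (1 - tie_sum / (n ** 3 - n))


# Hypothetical total word counts for four groups (not the study's data).
example = [
    [1010, 1250, 980, 1400],
    [1100, 900, 1320, 1150],
    [1500, 870, 1290, 1600],
    [1180, 1050, 1450, 760],
]
h = kruskal_wallis_h(example)
# With 4 groups, compare H with the .05 chi-square value for 3 df (7.815).
print(f"H = {h:.3f}; exceeds 7.815: {h > 7.815}")
```

The rank test is used here because it does not rest on the homogeneity assumption that Bartlett's test rejected for the total word counts.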

For analysis of the progress of 
acquisition all periods in Trials 1, 6, 
and 12 were analyzed, using a Treat- 
ment by Ss design. Then a summary 
analysis was computed to test the 
differences among groups and related 
interactions, using pooled subjects 
effects as error terms. The assumption
of homogeneity involved in these
error terms was tested by Bartlett's
test. The assumption was found to
be tenable (P > .05) except in the
case of the pooled Trial × S
interaction. However, although
homogeneity of variance is questionable
for this interaction, none of the four
separate F ratios approached
significance. This suggests that
departure from the homogeneity assumption


TABLE 1 


Total Word Count for 52 Min.
of Responding





Group Mean 
1111.47 
1061.45 
1285.33 


1166.12 | 405.44 


TABLE 2 


Analysis of Variance of Verbal
Rate on Trials 1, 6 and 12


Source df | MS 
| 149.27 | 
302.60 


Groups (G) 
Ss within G (Ss/G)
92. 


s 


194. 


did not distort the summary findings 
to any great extent. The summary 
analysis is presented in Table 2. The 
results indicate that neither Trial 
nor Period means differed significantly. 
The P × T × G interaction was
found to be significant (P = .05). 
This finding can be interpreted to 
indicate that the groups differed with 
respect to changes which the period 
means underwent from the beginning 
to the end of training. The inter- 
action can be understood in terms of 
the analyses of the individual groups 
which indicated that a P × T interaction
existed only in Group T-S
($F = MS_{P \times T}/MS_{P \times T \times S} = 2.64$, with
6 and 96 df; P < .05). Thus, in
Group T-S, rate changes occurred 
from period to period as a function of 
repeated tone-shock presentations. 
No such changes occurred in the 
other three groups, as indicated by 
non-significant F ratios for their 
P × T interactions.

The changes in Group T-S are
presented in Fig. 1. The greatest
change in verbal behavior occurred
during Period B. While tone onset
led to a decrease on Trial 1, a mean
increase of 1.56 words over the
pre-tone rate was observed in Trial 6.
At the end of training this increase
rose to 2.37 words. Period C showed
a somewhat different pattern as
training progressed. On Trial 1 the
verbal rate in Period C was below the
pre-tone rate. On Trial 6 the mean
rate difference in Period C was 2.00
words above the pre-tone rate. At
the end of training, however, this
difference dropped to .7 words. Thus,
the greatest increase during Period
C was found during the mid-point of
training. The post-shock rate
(Period D) showed an initial decrease
over the pre-tone level. During the
early trials the shock appeared to
increase rate temporarily; on Trial 6
the mean increase (D minus A) was
1.34 words. At the end of training,
however, this effect was no longer
observed.

Fig. 1. Mean verbal rates for the
experimental group (T-S). [Abscissa:
Periods A to D; one curve per trial.]

The effect of tone-shock pairing on 
the distribution of rate within trials 
is thus reflected by a progressive 
increase of verbal rate during Period 
B, with an initial increase and subse- 
quent decline in Period C. A small 
increment in rate during Period D 


was observed early in training but 
was not seen at the end of acquisition. 
After 12 trials, the effect of the tone- 
shock sequence is an initial post-tone 
increase in rate, followed by a gradual 
decrease which returns the rate to 
its pre-tone level after shock-delivery. 

That the effect of tone-shock pres- 
entations is relatively unstable is 
indicated by the finding that the 
rate changes in Group T-S are no 
longer observed during the fifth
“extinction” trial (Trial 17) and the 
rate appears to be no longer affected 
by the tone presented alone. 

Heart rate—The heart rate was 
obtained by measuring four heart 
cycles during all periods of Trials 
1, 6, 12, and 17.³ The mean duration
of the four cycles, as given by the 
distance which the EKG tape trav- 
elled, was converted into beats per 
minute and all data were treated in 
this unit. The four cycles chosen 
were those which immediately pre- 
ceded or followed the tone or shock. 
Because of illegibility of portions of 
the records, heart-rate measures from 
5 Ss had to be omitted, although 
these Ss were included in measures 
of verbal rate. Heart-rate data were 
treated statistically in the same 
manner as described for verbal rate. 
All pooled interaction terms were 
tested for homogeneity of variance 
by Bartlett’s test. The assumption 
of homogeneity was found to be 
tenable (P > .05), except for the 
P × T × S interaction. However,
since all of the four separate F ratios
($MS_{P \times T}/MS_{P \times T \times S}$) yielded values of
less than unity, it is believed that a 
departure from the homogeneity as- 
sumption did not seriously alter the 
interpretation of these F ratios. The 


³ On the basis of pilot work and earlier data
(1, 5) four cycles were chosen in order to yield 
heart-rate values which would bracket the la- 
tency of the initial cardiac response to the tone. 











summary analysis is presented in 
Table 3. The trial variable yielded 
significant F ratios in the overall 
analysis and in the component esti- 
mates for each group indicating that 
heart rate showed a decline in all 
groups as training progressed. For 
Trial 1, the mean heart rate of all 
groups was 94.23, for Trial 6 it was 
89.98 and for Trial 12 it was 87.14. 
At the end of the session the mean 
was 85.62. 
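
The conversion from tape distance to beats per minute described above can be sketched as follows. The tape speed is not reported in the text, so the 25 mm/sec. figure in the example is purely an assumption, as are the function and parameter names.

```python
def beats_per_minute(tape_distance_mm, tape_speed_mm_per_sec, n_cycles=4):
    """Convert the tape distance spanned by n_cycles heart cycles to beats/min."""
    if tape_distance_mm <= 0 or tape_speed_mm_per_sec <= 0 or n_cycles <= 0:
        raise ValueError("distance, speed, and cycle count must be positive")
    # Time taken by the measured cycles, from distance travelled and tape speed.
    seconds_for_cycles = tape_distance_mm / tape_speed_mm_per_sec
    # Mean duration of one cycle, then cycles per 60 sec.
    mean_cycle_sec = seconds_for_cycles / n_cycles
    return 60.0 / mean_cycle_sec


# Hypothetical reading: four cycles spanning 80 mm at an assumed 25 mm/sec.
print(round(beats_per_minute(80.0, 25.0), 2))
```

Averaging over four cycles rather than one reduces the influence of beat-to-beat irregularity while keeping the sample short enough to sit close to the tone or shock.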

The significant P × G interaction
indicates that heart rate from period 
to period varied differentially in the 
groups. In the component analyses 
a significant F was found only in 
Group T-S ($MS_P/MS_{P \times S} = 4.92$,
with 3 and 39 df; P < .01). Thus, 
neither shock alone, nor tone alone 
resulted in variations in period means 
similar to those found in Group T-S. 
It is interesting to note that in Group 
C-S, despite a considerable increase 
in Period D on Trial 1, post-shock 
rates did not remain sufficiently high 
on subsequent trials to result in 
significant differences in period means. 
The tone in Group C-T had no 
noteworthy effect on heart rate. 


TABLE 3

Analysis of Variance of Heart
Rate on Trials 1, 6 and 12

Source                  df        MS        F
Groups (G)               3    297.24       <1
Ss within G (Ss/G)      64   1478.52
Trials (T)               2   3307.22    34.32*
T × G                    6     64.88       <1
T × Ss/G               128     96.37
Periods (P)              3         ?        ?
P × G                    9      ?.09    ?.67*
P × Ss/G               192     39.71
P × T                    6     32.40       <1
P × T × G               18      6.89       <1
P × T × Ss/G           384     37.78

* P < .01.


Fig. 2. Mean heart rates for the
experimental group (T-S). [Abscissa:
Periods A to D.]


The changes in heart rate in Group
T-S are presented in Fig. 2. The 
tone led to a slight decrease on Trial 1. 
Repeated tone-shock presentations, 
however, resulted in post-tone heart 
acceleration. Period C showed a 
greater rate increase than Period B, 
suggesting the operation of temporal 
factors, such as in delayed condition- 
ing. Compared with the pre-tone 
rate, the effect of shock-delivery was 
found to be greater during the middle 
of training, decreasing somewhat on 
Trial 12. This decrease may be due 
to some adaptation to the shock by 
Trial 12. This suggestion is sup- 
ported by the above-mentioned de- 
crease of the shock effect in Group 
C-S. The continuing decrease of the 
basal heart rate during training must 
also be considered, since it may inter- 
act with the effect of shock in a way 
which modifies the post-shock rate 
increment. Furthermore it suggests 
a general lowering of tension as the 
session progressed. The overall dif- 
ference between periods in Group T-S 
is one in which there is a continued 
rise of heart rate from pre-tone to 
post-shock periods during training. 








The conditioned heart-acceleration is
no longer observed on the fifth
extinction trial (Trial 17).


DISCUSSION


Verbal rate. Application of the Estes
and Skinner (2) anxiety paradigm to 
verbal behavior did not result in the 
expected depressant effect on the rate of 
continuous verbal responding. In fact, 
the tone-shock pairing increased the rate 
of such behavior. The discrepant find- 
ings raise several questions concerning 
the generality of the anxiety effects. In 
the Estes and Skinner study, the hunger- 
motivated bar-pressing response and the 
rat’s response to shock, such as freezing, 
jumping or standing on his hind legs, 
may be incompatible motor responses. 
The depressive effect would then repre- 
sent the behavioral consequences of the 
rat’s learning to carry out “anticipatory 
shock-avoidance” responses after onset 
of the signal. Such behavior, although 
it may be biologically effective in re- 
ducing the effects of shock, would inter- 
fere with bar-pressing and lower its 
frequency of occurrence. Given a situa- 
tion in which the response to the noxious 
stimulus and the learned operant re- 
sponse are not incompatible, one might 
expect facilitation of the operant response 
since components of the response to 
the noxious stimulus, now conditioned 
to the signal, may serve to increase the 
strength of the operant. 

In the present study responses con- 
ditioned to the tone-shock sequence are 
conceivably quite compatible with in- 
creased verbalizing. Past learning, 
based on the general admonition that 
“one should do something” to avoid a
variety of dangers, may determine a 
general response of increasing activity 
when a warning signal occurs. Such an 
increase may be irrelevant or even 
interfering in complex tasks, but facilitat- 
ing when the task is as simple as the 
present one. Facilitation of verbal re- 
sponding could also have been contingent 
on S’s self-instructions that the shock
was intended as punishment for not
responding. An increased rate of
responding would then be instrumental
as a shock-avoidance response.

Instances of increased activity and
better performance of “shock-threat”
groups have been reported by Spence,
Farber, and Taylor (7) and others. The
influence of electric shock on verbal
performance has been discussed by
Farber (3), who suggests that the
disruptive or facilitating effect of shock
on verbal learning depends on the test 
situation. Furthermore, the general re- 
sponse of increased activity is not 
unfamiliar to the clinician who often 
utilizes such cues as increased motor 
activity and fast speech as indices of 
“clinical” anxiety. 

The effectiveness of a warning signal
preceding a noxious stimulus in changing 
ongoing behavior has been demonstrated 
by the present study and that of Estes 
and Skinner. The finding of a difference 
in the direction of rate change, however, 
requires that some qualifications be 
made to a general statement concerning 
anxiety effects on ongoing behavior. 
Since the extent and direction of anxiety 
effects on continuous behavior appear
to be related to the following variables, a 
general description of the properties of 
anxiety may have to await their sys- 
tematic exploration. (a) The type of 
noxious stimulus and its intensity. Pre- 
viously learned or innate responses to 
the stimulus and their compatibility 
with concurrent responses may affect 
both the intensity and direction of the 
behavioral change. (b) Duration of the
warning signal and frequency of pairing
with the noxious stimulus or other noxious
stimuli. Temporal discrimination may 
reduce the duration of the anxiety effects. 
Prolonged presentation of the signal 
may lead to adaptation and subsequent 
decrease of its effectiveness. Such effects 
are suggested by the shift of response 
distribution in our data from an overall 
increase during the tone-on condition 
in Group T-S on Trial 6 to a greater 
but briefer increase during Period B on 











Trial 12. (c) The complexity of the
continuous task and the degree of
compatibility with the response to the
noxious stimulus. With human Ss, in
addition to simple muscular antagonism,
S's self-reactions and verbally mediated
generalizations (from past experiences)
may play a major role in relating the
warning signal and noxious stimulus to
the continuous task in numerous ways.

No analysis of S’s verbal responses was
made due to the practical problem of
tabulating the large number of words
given. However, observations indicated
that content variables may also reveal
interesting effects of the tone-shock
sequence. For example, Ss in Group
T-S showed a greater tendency to repeat
words during the tone-on period from
trial to trial than Ss in the other groups.
Similarly post-shock content in Groups
T-S and C-S appeared to show such
stereotypy. In a post-experimental
inquiry Ss could not verbalize the
tone-shock relationship. The most frequent
hypothesis which they gave about the
purpose of the experiment was that the
shock or tone was contingent on some
content variables (topics).

Heart rate. In contrast to most
studies on cardiac conditioning, the
present Ss were not resting but engaged
in continuous talking. Consequently
their high basal heart rate and the
constant self-stimulation by their own verbal
responses change conditions in the
present study sufficiently to limit
comparisons with other findings. During
conditioning of Group T-S the data
reveal a temporal gradient of increasing
heart rate from Period A to D, i.e., the
closer the shock, the more similar is
the heart rate to the post-shock rate.
The successful conditioning of heart
rate speaks for the effectiveness of the
experimental operations in producing
those autonomic changes on which the
definition of clinical anxiety rests so
heavily. This should lend further
support for study of the construct and its
defining properties in controlled
laboratory settings.


rests so 


SUMMARY 


The effects of a signal preceding an electric 
shock on the rate of continuous verbal respond- 
ing and on heart rate were investigated with 
college students. All Ss were asked to continue 
saying separate words throughout the 52-min. 
session. Experimental Ss were given acquisition 
training comprising 12 trials during which a 
1-min. tone was coterminal with a 1-sec. shock.
Five extinction trials (shock omitted) were 
then given. Three control groups were given 
(a) only the signal, (b) only the shock, and 
(c) neither shock nor signal, with the same 
frequency and according to the same schedule 
as the experimental group. 

The results indicated that the total verbal 
output did not differ for the groups. The 
training trials showed no effect on the distribu- 
tion of verbal responses for the control groups. 
In the experimental group repeated tone-shock 
pairings resulted in a marked increase of verbal 
rate following onset of the tone. While this 
increase lasted the entire tone duration during 
the middle of training (Trial 6), it was not 
maintained beyond a 30-sec. post-tone period 
at the end of training. On the fifth extinction 
trial, verbal rate no longer changed as a function 
of tone presentation in the experimental group. 

In all groups heart rate showed a decline as 
training progressed. Significant heart rate 
changes within trials, however, were obtained 
in the experimental group only. In this group 
a progressive heart acceleration was observed 
from pre-tone to post-tone to pre-shock to post- 
shock samples. On the fifth extinction trial 
heart rate changes no longer occurred on presen- 
tation of the tone to experimental Ss. 

The findings were discussed in relation to 
Estes and Skinner’s experimental definition of 
anxiety. It was suggested that the general
effect of these operations (anxiety) does not
always show depression of the rate of continuous
behavior. Several variables were suggested
which may determine the conditions under
which increase or depression of continuous
behavior results.


REFERENCES 


1. Davis, R. C., Buchwald, A. M., & Frankmann,
R. W. Autonomic and muscular
responses, and their relation to simple
stimuli. Psychol. Monogr., 1955, 69,
No. 20 (Whole No. 405).

2. Estes, W. K., & Skinner, B. F. Some
quantitative properties of anxiety.
J. exp. Psychol., 1941, 29, 390-400.


















3. Farber, I. E. The role of motivation in
verbal learning and performance.
Psychol. Bull., 1955, 52, 311-327.

4. Keller, F. S., & Schoenfeld, W. N.
Principles of psychology. New York:
Appleton-Century, 1950.

5. McAllister, W. R., Farber, I. E., &
Taylor, J. E. Conditioned heart rate
as a function of anxiety and CS-UCS
interval. ONR Tech. Rep. No. 2, 1954,
Project NR 154-107, State Univer. of
Iowa.

6. Skinner, B. F. Science and human behavior.
New York: Macmillan, 1953.

7. Spence, K. W., Farber, I. E., & Taylor, E.
The relation of electric shock and anxiety
to level of performance in eyelid
conditioning. J. exp. Psychol., 1954, 48,
404-408.

8. Walker, H. M., & Lev, J. Statistical
inference. New York: Holt, 1953.


(Received January 18, 1957) 














Journal of Experimental Psychology
Vol. 55, No. 1, 1958


UTILITY OF GRADES: 
LEVEL OF ASPIRATION IN A DECISION 
THEORY CONTEXT¹


SELWYN W. BECKER 


The Pennsylvania State University
AND SIDNEY SIEGEL²
Center for Advanced Study in the Behavioral Sciences 


When a person is presented with 
a task whose outcome can be meas- 
ured on an achievement scale, he may 
strive for a particular goal or level 
of achievement on that scale. The 
term level of aspiration refers to this
goal for which he strives. Siegel (10) 
has suggested that a level of aspira- 
tion situation may be viewed as a 
decision situation, has shown that 
certain variables in the Lewinian
theoretical approach to level of aspiration
(6) may be equivalent to certain
variables in decision theory (2, 3,
7, 11), and has shown that the
Lewinian formulation may be synthesized
with utility theory. An achievement 
scale may be thought of as a utility 
scale, on which each achievement 
goal has utility (subjective value) 
for the person, and level of aspiration 
may thus be defined as a point on 
that utility scale: The level of aspira- 
tion of an individual is a point in the 
positive region of his utility scale of an 
achievement variable; it is at the least 
upper bound of that chord (connecting 
two goals) which has maximum slope, 
i.e., the level of aspiration is associated
with the higher of the two goals between 


¹ Support of this study was provided in part
by grants to S.S. from the National Institute 
of Mental Health (Grant M-1328) and from 
the Council on Research of The Pennsylvania 
State University, and in part by a Public Health 
Service Research Fellowship to S. W. B. from 
the National Institute of Mental Health. 

² On leave from The Pennsylvania State
University, 1957-1958. 




which the rate of change of the utility 
function is a maximum (10). 

When the goals on the achievement 
scale are discrete, ordered metric 
measurement (8) of the individual’s 
utility of those goals is sufficient to 
identify his level of aspiration, for
an ordered metric scale not only 
ranks the goals in their order of 
preference but also ranks the dis- 
tances (differences in preference) be- 
tween the goals. With such a scale, 
therefore, it is possible directly to 
identify that goal which has the
largest difference in utility (i.e.,
maximum rate of change of utility)
between it and the next lower goal, 
and by the definition cited above, 
this goal represents the person’s 
level of aspiration. 
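
The identification rule for discrete goals can be sketched as follows. This is an illustration only: the function name, the input format (one rank per adjacent pair of goals, with rank 1 for the largest distance), and the example S are assumptions introduced here, and the sketch ignores the restriction of the definition to the positive region of the utility scale.

```python
def level_of_aspiration(goals, distance_ranks):
    """Return the goal lying just above the largest adjacent utility distance.

    goals          -- achievement goals ordered from lowest to highest
    distance_ranks -- distance_ranks[i] is the rank of the utility distance
                      between goals[i] and goals[i + 1]; rank 1 = largest
    """
    if len(goals) < 2 or len(distance_ranks) != len(goals) - 1:
        raise ValueError("need one distance rank per adjacent pair of goals")
    # Position of the largest distance (the first one if ranks are tied).
    largest = distance_ranks.index(min(distance_ranks))
    # The level of aspiration is the upper goal of that pair.
    return goals[largest + 1]


# Hypothetical S: the C-to-B distance is the largest of the four.
grades = ["F", "D", "C", "B", "A"]
print(level_of_aspiration(grades, [3, 2, 1, 4]))
```

Only the ranking of the distances is used, which is all that an ordered metric scale provides; no numerical utilities are required.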

The purpose of the present paper 
is to report experimental evidence 
which supports this definition of 
level of aspiration by confirming two 
hypotheses drawn from it. 


METHOD


The instructor in an introductory statistics 
course announced that he would give any 
student in the class an opportunity to gamble 
for his midterm grade in lieu of basing the 
student's grade on his performance on the mid- 
term examination. Of the 50 students in the 
class, 23 volunteered to gamble for their grade 
rather than take the examination. 

These volunteers met in a group at an evening 
meeting. Each S was given a booklet containing 
a series of alternative gambles, and he was 
instructed to record his choice among each in 
the series. By using test booklets it is possible 
to obtain individual ordered metric scales 













simultaneously from a number of Ss. The 
alternatives in the booklets were of this sort: 


Would you prefer a 50-50 chance of
    □ Alternative 1: an A or an F
or would you prefer
    □ Alternative 2: a B or a D


The Ss were told that after each had completed
his test booklet, one of the pages of the
booklet was to be selected at random by use of 
a table of random numbers, and the gamble 
each S had chosen on that page would be the 
one on the basis of which his midterm grade 
would be determined. For the gamble, a zero- 
association nonsense syllable die was to be cast. 
This die was passed among Ss for examination. 
The purpose of using it was to keep each S’s 
subjective probability toward the alternative 
outcomes at .50 (2, 8). It was stressed that 
Ss should make each of their choices carefully, 
because any one could be the crucial one on 
which their grade would depend. 

The booklets contained 15 offers, based on 
the procedure (8) for obtaining higher ordered 
metric measurement. For an individual whose 
choices are consistent and transitive, this pro- 
cedure results in an ordering of the distances 
(differences in preference) and combinations of 
distances between grades. Only certain of S’s 
choices need be observed to determine his ordered 
metric scale. His choices on the remaining 
offers serve as checks on the consistency and 
transitivity of his scale. 
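
One way to see why a choice between two such gambles orders two distances is sketched below. It assumes, as the procedure intends, that S treats each outcome of the die as having probability .50 and chooses the gamble with the larger expected utility; the utility symbol u is notation introduced here for illustration.

```latex
\begin{align*}
\tfrac{1}{2}\,u(A) + \tfrac{1}{2}\,u(F) &> \tfrac{1}{2}\,u(B) + \tfrac{1}{2}\,u(D) \\
u(A) + u(F) &> u(B) + u(D) \\
u(A) - u(B) &> u(D) - u(F)
\end{align*}
```

Under these assumptions, a preference for Alternative 1 in the sample offer would indicate that the distance between A and B exceeds the distance between D and F.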

When each S had completed his test booklet, 
the critical page was chosen, the gamble made, 
and, by a ruse,³ its outcome was predetermined:


³ After each S had completed his own test
booklet, the Ss as a group completed a booklet 
representing their collective decisions. This 
procedure was in connection with a group 
decision study which will be reported elsewhere. 
After the group decision booklet had been filled 
out, Ss were led to decide to use this latter 
booklet in the selection of a “critical” page for 
the gamble for the determination of grades. 
This decision decreed that every S in the group 
would receive the same grade. Ostensibly, 
the selection of the page containing the “critical” 
offer was made on the basis of a number chosen 
at random from a table of random numbers. 
Actually, unknown to Ss, the Es exercised con- 
trol over which number was chosen, predetermin- 




Some responded to this outcome with con- 
siderable dismay and disappointment. This led 
to the introduction of a second ruse. The Ss 
were told that those who were dissatisfied with 
the C they had won could obtain an individual 
interview in which “something could be arranged 
(perhaps in terms of extra work) so that your 
grade might be raised.” They were told that 
such interviews would have to take place during 
the remainder of the same evening or not at all, 
but that the interviewing could not begin im- 
mediately because the instructor had to place 
an urgent long distance telephone call for which 
he would have to leave the building for 5 or 10 
min. The Ss who wished to obtain an interview 
were instructed to remain and await his return. 
The Es left and unexplainedly stayed away for 
about 50 min., during which Ss had to wait 
idly. This second ruse permitted a test of one 
of the hypotheses, as will be explained. 

After the Es returned, individual interviews 
were conducted with every S who had waited. 
The course instructor interviewed S while the 
other E sat out of sight of S, recording S’s 
responses to the questions. At the time of the 
interviews, neither E had examined any S’s
test booklet and thus neither had any informa- 
tion concerning any S’s utility of grades. 

The questions asked by the interviewer were 
constructed to gain information for an independent
assessment of S’s level of aspiration for his
midterm grade. The interview opened with 
“Why are you dissatisfied with the C?”  Fol- 
lowing S’s answer to this, the following questions 
were asked: “What grade would you like to get 
on this midterm?”; “What grade do you think
you'd get if you had a chance to take the midterm
examination?”; “What is the lowest grade
you could get and still not feel too badly about 
it? (You might not feel really good about it, 
but still you would be satisfied.)” If the 
response to this question indicated that S 
would not be satisfied unless his grade were 
above C, he was asked, “In order to get a B 
(A), how many hours of work would you do for 
me on some routine filing and clerical work I 
have?” To conclude the interview, S was 
offered this arrangement: “Suppose we make a 
deal like this: I'll let you take the midterm 
examination, and if you get an A or a B on it 
that grade will be recorded for you, but if you 
get a D or F you will have to take that too. Do
you want to do that, or would you rather settle 
for the C you already have and forget about 
the exam?” 

From each S’s choices in his test booklet, 
an ordered metric scale representing his utility 


ing the offer on which the grade would be based. 
each S won a C for his midterm grade. 














UTILITY OF GRADES 83 


TABLE 1

SAMPLE PROTOCOL

Offer | Alternatives*         | Choice  | Inferred relation
  1   | A or C vs. B or B     | AC < BB | AB < BC
  2   | B or D vs. C or C     | …       | …
  3   | C or F vs. D or D     | CF …    | …
  4   | B or F vs. C or D     | …       | …
  5   | A or F vs. B or D     | …       | …
  6   | A or D vs. B or …     | …       | …
  7   | A or F vs. B or B     | …       | …
  8   | B or F vs. C or C     | …       | …
  9   | A or F vs. B or C     | AF …    | …
 10   | A or D vs. B or B     | AD < BB | AB < BD
 11   | A or F vs. C or D     | AF …    | …
 12   | A or F vs. C or C     | AF < CC | AC < CF
 13   | A or F vs. D or D     | AF …    | …
 14   | B or F vs. D or D     | BF …    | …
 15   | A or D vs. C or C     | AD > CC | AC …

* The alternative chosen by S is italicized.


of grades was constructed. Table 1 presents
a sample protocol. It shows the 15 offers
given one S, the alternative he selected (in italics)
from each offer, and the difference in preference
which can be inferred from that choice. For 
example, on the first offer, S chose to win a B 
for sure rather than to gamble on a 50-50 chance 
of winning either an A or a C. This choice 
is recorded as AC < BB (read: a 50-50 chance 
of getting either A or C is less preferred than a 
50-50 chance of getting either B or B). 
From this choice, it can be shown for this S 
that the difference in utility between an A and 
a B is less than the difference between a B 
and a C. This is recorded as AB < BC (read:
the distance between A and B is less than 
the distance between B and C). The method 
of translating choices to distance relations, and 
the rationale and proof underlying it, may be 
found elsewhere (8). 
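
The translation from a choice to a distance relation can be illustrated with a minimal sketch. It assumes that S maximizes expected utility with both outcomes of every gamble weighted .50, and that each offer pits an outer pair of grades against an inner pair lying between them. The function name and the grade encoding are hypothetical; the formal method and its proof are those of (8), not this code.

```python
# Grades in descending order of preference (assumed: A > B > C > D > F).
GRADES = "ABCDF"


def distance_relation(outer, inner, chose_outer):
    """Translate one 50-50 choice into a relation between two distances.

    outer: pair such as "AC" (the best and worst grades in the offer)
    inner: pair such as "BB" (grades lying between the outer pair)
    chose_outer: True if S preferred the gamble on the outer pair

    With equal probabilities, preferring the outer pair means
    u(P) + u(R) > u(Q1) + u(Q2), i.e. u(P) - u(Q1) > u(Q2) - u(R).
    """
    p, r = sorted(outer, key=GRADES.index)
    q1, q2 = sorted(inner, key=GRADES.index)
    positions = [GRADES.index(g) for g in (p, q1, q2, r)]
    if positions != sorted(positions):
        raise ValueError("outer pair must bracket the inner pair")
    sign = ">" if chose_outer else "<"
    return f"{p}{q1} {sign} {q2}{r}"


# Offer 1 of Table 1: S took the sure B over the A-or-C gamble.
print(distance_relation("AC", "BB", chose_outer=False))  # AB < BC
```

The same rearrangement reproduces the relations legible in Table 1 for offers 10 and 12 (AB < BD and AC < CF).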

From Table 1 it may be seen that for this S
only four choices (the first, second, third, and
fifth) need be observed to determine his ordered
metric scale: BC > CD > DF > AB. His
choices on the fourth and sixth offers serve as
checks on the consistency and transitivity of
his scale. Figure 1 shows the ordered metric
scale of utility of grades for S whose protocol
is given in Table 1.


[Figure: scale line marked A, B, C, D, F]
Fig. 1. Example of an ordered
metric scale: BC > CD > DF > AB.


The seventh and succeeding choices yield 
information necessary to the construction of a 


higher ordered metric scale of utility (8). This 
information was not necessary for the present 
study, but was collected as data for a group 
decision study which will be reported in a sub- 
sequent paper. 

From each scale, the grade representing S’s 
level of aspiration, as defined, could be read 
directly. For example, S whose scale is repre- 
sented in Fig. 1 had B as his level of aspiration: 
B is the goal which has the largest distance 
between it and the next lower goal. 
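
Reading the level of aspiration from a scale can be sketched as follows. The representation of a scale as a list of adjacent-grade distances, ordered from largest to smallest, is an assumption made for illustration, as is the function name.

```python
def level_of_aspiration(ordered_distances):
    """Return the grade at the upper bound of the largest adjacent distance.

    ordered_distances: adjacent-grade distances listed from largest to
    smallest, e.g. ["BC", "CD", "DF", "AB"] for BC > CD > DF > AB.
    Each distance is written with the higher grade first.
    """
    largest = ordered_distances[0]
    return largest[0]


# The scale of Fig. 1: the largest distance is BC, so the level is B.
print(level_of_aspiration(["BC", "CD", "DF", "AB"]))  # B
```

A scale whose largest distance is DF would, by the same rule, give D as the level of aspiration.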

From the verbatim account of each interview, 
four sorts of information were abstracted for 
each S: (a) his desired grade, (b) his expected 
grade, (c) the lowest grade which would be 
satisfactory to him, and (d) the number of 
hours he was willing to work at clerical tasks 
in order to effect a one-level raise in grade. 

Using this information, the two Es inde- 
pendently estimated each S’s level of aspiration. 
They did this prior to examining any S’s test 
booklet. In assessing the four sorts of informa- 
tion, they first classified each S on the basis 
of (c) above. This enabled them to assign
each S to one of three categories (A, B, or C), 
indicating the lowest grade which would be 
satisfactory to each—none of the interviewees 
admitted to being satisfied with a D. The 
other three sorts of information were used to 
rank all Ss within the three main categories. 
Thus each E ranked all the interviewed Ss 
from highest to lowest in level of aspiration for 
midterm grade as revealed in the interview. 
The two Es’ sets of rankings were in very close
agreement: by the Spearman rank correlation
coefficient (9, pp. 202-213), their inter-scorer
reliability is r_s = .99.


HyPoTHESES 


The procedures of this experiment⁴
were designed to permit the test of two 
hypotheses, each of which follows from 
the assertion (10) that on an ordered 
metric scale of utility of goals on an 
achievement scale, the level of aspiration 
is represented by the goal which has 
the largest distance between it and the 
next lower goal. The two hypothesis- 
tests assess the validity of that definition. 


4 Following the data-collection, individual
discussions were held with all 23 Ss. The out- 
come of these discussions was that all Ss agreed 
to take the examination. The purpose, tech- 
niques, and results of the experiment were 
explained to the entire class. Every student 
understood that no course grade would be 
influenced by the experiment. 








84 SELWYN W. BECKER AND SIDNEY SIEGEL 


Hypothesis I—If S aspires for less 
than a C on a test, he should be fully 
satisfied with a C when it is awarded 
to him. In the context of this experi- 
ment, such an S would have no reason 
to bother to wait for an interview, 
especially when the interviews were 
held late at night (from 9:00 P.M.
until after midnight) and moreover were 
unexpectedly delayed by the unexplained 
50-min. detainment of the Es. This 
reasoning underlies the first hypothesis, 
viz.: Those Ss who do not wait for an 
interview will be Ss on whose ordered 
metric scales of utility of grades the 
largest distance is between D and F. 

Hypothesis II—As we have men- 
tioned, the interview was used to obtain 
an independent assessment of each S’s 
level of aspiration, with which his level 
of aspiration shown by his scale of 
utility of grades could be compared. 
The second hypothesis concerned this 
comparison, viz.: There will be a positive 
correlation between Ss’ levels of aspira- 
tion as expressed in the interviews and 
their levels of aspiration as given by 
their scales of utility of grades (on which 
the grade at the upper bound of the 
largest distance is taken as level of 
aspiration). 


RESULTS 


Analysis of the responses in Ss’ 
test booklets by the method reported 
elsewhere (8) yielded consistent and 
transitive ordered metric scales for 
20 of the 23 Ss. The other Ss made 
inconsistent and/or intransitive 
choices and therefore their scales 
could not be included in the data 
for the hypothesis-tests. 

Hypothesis I was supported by the 
data. Four Ss left the room before 
the Es returned from their long errand 
to conduct the individual interviews. 
Every one of these was found to have 
an ordered metric scale of utility 
of grades on which the largest dis- 
tance was between D and F. 

Hypothesis II was confirmed by 
statistical analysis. A Spearman 


rank correlation coefficient was computed
between the ranks of Ss’ levels
of aspiration as revealed by their
ordered metric scales and the ranks
of their levels of aspiration as revealed
in the interviews. The correlation
between these two independent
indices of level of aspiration
was r_s = .84. Corrected for ties (9,
pp. 206-210), r_s = .83, P < .001.
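
A tie-corrected rank correlation of this kind can be sketched as the product-moment correlation of midranks, which is one standard way of handling ties. Whether it reproduces the exact computation in (9) is assumed rather than verified, and the function names are hypothetical. The individual rankings are not given in the source, so no attempt is made to reproduce the reported coefficients.

```python
from math import sqrt


def midranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Product-moment correlation of the midranks of x and y."""
    rx, ry = midranks(x), midranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)
```

Many Ss necessarily share a letter grade as their scale-derived level of aspiration, so ties are heavy on that variable, which is why the correction matters here.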


Discussion 


The evidence adduced in support of 
Hypothesis I is direct behavioral con- 
firmation of the validity of the definition 
of level of aspiration as a point on the 
person’s scale of utility of goals on an 
achievement scale. The Ss who chose 
not to remain for an interview (which 
had as its only stated purpose the 
exploration of possibilities for raising the 
interviewee’s grade from a C) gave 
behavioral evidence, in leaving the room, 
that they were not motivated to achieve 
more than a C. In every case, the 
ordered metric scales of these Ss showed 
this to be the case. 


The reader may wonder why we did not 
predict that those Ss would leave whose ordered 
metric scales showed a C to be their level of 
aspiration. This prediction could not be made 
because a continuum underlies the discrete 
entities (Grades A, B, C, etc.) on the achieve- 
ment scale. With an ordered metric scale, we 
may only specify the grade which is closest to 
his level of aspiration on that continuum. If 
the largest distance on the ordered metric 
scale is below the C, we can only say that S’s 
level of aspiration is closer to a C than to any 
other letter grade. We cannot say whether 
it is slightly above or slightly below C. 


The data which support Hypothesis 
II also lend confirmation to the defini- 
tion, if the interview index of level of 
aspiration is valid. That it may be is 
suggested by the fact that other studies 
(1, 4, 5) have reported satisfactory 
results with the use of verbal expression 
of expectancy and desire as an index of 
level of aspiration. The present Es 
felt that a reasonably valid ranking of 
Ss with respect to their levels of aspira- 




tion could be made on the basis of the 
interview information. 


SUMMARY 


Evidence is presented to support the asser- 
tion, detailed elsewhere (10), that a person's 
level of aspiration is associated with the least 
upper bound of the largest distance on his 
ordered metric scale of utility of various goals 
on an achievement scale. The goals studied 
were various possible midterm grades in an 
introductory statistics course, and the student- 
volunteers’ ordered metric scales of these were 
determined in the course of their gambling 
with their instructor for a grade. 


REFERENCES 

1. Child, I. L., & Whiting, J. W. M. Determinants
of level of aspiration: evidence
from everyday life. J. abnorm. soc.
Psychol., 1949, 44, 303-314.

2. Davidson, D., Suppes, P., & Siegel, S.
Decision-making: an experimental approach.
Stanford, Calif.: Stanford
Univer. Press, 1957. Chap. 2.

3. Edwards, W. The theory of decision-making.
Psychol. Bull., 1954, 51, 380-417.

4. Frank, J. D. Recent studies of the level
of aspiration. Psychol. Bull., 1941, 38,
218-226.

5. Gebhard, M. E. Changes in the attractiveness
of activities: the effect of
expectation preceding performance. J.
exp. Psychol., 1949, 39, 404-413.

6. Lewin, K., Dembo, T., Festinger, L., &
Sears, P. S. Level of aspiration. In
J. McV. Hunt (Ed.) Personality and the
behavior disorders. Vol. 1. New York:
Ronald, 1944. Pp. 333-378.

7. Thrall, R. M., Coombs, C. H., & Davis,
R. L. (Eds.) Decision processes. New
York: Wiley, 1954.

8. Siegel, S. A method for obtaining an
ordered metric scale. Psychometrika,
1956, 21, 207-216.

9. Siegel, S. Nonparametric statistics for the
behavioral sciences. New York: McGraw-Hill,
1956.

10. Siegel, S. Level of aspiration and decision
making. Psychol. Rev., 1957, 64, 253-262.

11. von Neumann, J., & Morgenstern, O.
Theory of games and economic behavior.
(2nd Ed.) Princeton: Princeton Univer.
Press, 1947.


(Received January 24, 1957) 








Journal of Experimental Psychology 
Vol. 55, No. 1, 1958 


THE EMPIRICAL VALIDITY OF EQUAL
DISCRIMINABILITY SCALING ¹


EARL A. ALLUISI AND RAYMOND C. SIDORSKY ²


Laboratory of Aviation Psychology, The Ohio State University 


Garner and Hake (8) have sug- 
gested that information measures 
(9, 14, 16) be used in specifying S’s 
ability to make absolute judgments. 
They have suggested also that scales 
of equal discriminability (ED scales) 
be used in specifying how S makes 
those judgments, and they describe 
a technique for constructing such 
scales. These two suggestions repre- 
sent complementary ways of treating 
the same set of absolute-judgment 
data. 

Theoretically, the results of an 
information analysis should answer 
the question of how many stimulus 
categories can be absolutely discrimi- 
nated. The results of an ED scaling 
should answer the question of which 
specific stimulus categories can be 
discriminated equally well in absolute- 
judgment situations. 

Although these techniques are im- 
portant for research and theory (cf. 
10), and also for certain applications 
to design problems (cf. 11), their 
validity has not been firmly estab- 
lished. The major purpose of this 

1 This research was supported in part by the 
U. S. Air Force under Contract No. AF 33(616)-43
and Contract No. AF 33(616)-3612,
Project No. 7192, with The Ohio State Uni- 
versity Research Foundation, monitored by the 
Aero Medical Laboratory. Permission is grant- 
ed for reproduction, translation, publication, 
use, and disposal in whole or in part by or for 
the United States Government. The authors 
wish to acknowledge the assistance of Mr. 
Hugh B. Martin and Mrs. Ilse B. Webb in the 
collection and analysis of the data, and the many 
helpful contributions of Drs. P. M. Fitts, W. R. 
Garner, G. A. Miller, I. Pollack, and R. W. 
Queal, Jr.

2 Now at the Army Medical Research Labora- 
tory, Fort Knox, Kentucky. 




study is to determine the degree 
of validity that may be expected 
with the ED-scaling technique. 

Specifically, the major proposal 
tested in this study was the prediction 
that equal distances on an ED scale 
represent equal extents of discrimin- 
ability, and that stimuli represent 
equal tendencies to be erroneously 
identified when they have been 
selected to be spaced at equal intervals 
on the ED scale. Two experiments 
were conducted. In both experi- 
ments, Ss made absolute judgments 
of the sizes of small circles of light. 
The two experiments will be presented 
separately. 


EXPERIMENT I


This experiment was designed to 
measure the effects of (a) knowledge 
of results, (b) range of stimulation, 
(c) spacing between adjacent stimu- 
lus categories, and (d) number of 
stimulus and response categories, upon 
both information transmission and 
ED scaling. 

The experiment was conducted in 
four parts. In Part 1, 15 stimulus 
categories (small circles of light) were 
used and knowledge of results was 
provided S after each response. In 
Part 2, the same stimuli were used, 
but no knowledge of results was 
provided. Eight stimulus categories 
were used without knowledge of 
results in Parts 3 and 4, but the 
range of stimulus variation and the 
spacing between adjacent stimulus 
categories in Part 4 were both smaller
than in Part 3. 





VALIDITY OF EQUAL DISCRIMINABILITY SCALING 87 


Method 


Stimuli.—The stimuli in Parts 1 and 2 were 
15 circles of light that ranged in diameter from 
3/64 to 17/64 in., in steps of 1/64 in. Eight 
circles of light covering the same range in steps 
of 1/32 in. were used as stimuli in Part 3. The 
eight circles of light used in Part 4 ranged in 
steps of 1/64 in. from 7/64 to 14/64 in. in 
diameter. 

Apparatus.—The experiment was conducted 
in a booth uniformly painted a flat gunmetal 
gray. Stimuli were displayed singly on a 10-in. 
diameter circular display area of opal glass. 
The display surface was tilted 30° from vertical 
away from S, and its center was raised 30 in. 
above the floor level of the experimental booth. 

Stimuli were formed by projecting light onto 
the opal glass screen through holes drilled in a 
disc of 1/64-in. thick opaque black Plexiglas. 
The source for the stimulus light was a 6-v. 
Tung-Sol bulb located 3 in. behind the disc in
a light-proof tube. The disc was positioned 
directly behind the opal glass and mounted 
so that E could rotate it to bring any of the 15 
stimulus-producing holes into displaying position 
1 in. above and 1 in. to the right of the center 
of the display surface. An opaque mask covered 
the 14 holes in the nondisplaying positions at all 
times. 

A series of perforations forming a code num- 
ber for each symbol was drilled into the Plexi- 
glas disc. These numerals appeared 1 in. 
below the stimulus whenever a second light was 
activated, and permitted E to provide immediate 
knowledge of results in Part 1. 

On S’s side of the display, the only illumina- 
tion was provided by a diffuse spot of light, 
approximately 3 ft. in diameter, concentric with 
the display area. This gave a brightness of 
.2 ft.-L. as measured with a MacBeth Illuminom- 
eter normal to, and at the position of, the display 
surface. 

Subjects.—Three men and two women, rang- 
ing in age between 19 and 21 yr., served as Ss. 
All possessed normal or corrected 20/20 visual 
acuity. They were employed in the laboratory 
and were experienced in making psychophysical 
judgments of the sort required in this study. 

Procedure.—One S at a time was seated in 
the experimental booth facing the display sur- 
face so as to view the stimuli binocularly from a 
distance of approximately 28 in. He was in- 
structed to call out a number corresponding to 
the size of each circle presented. The number 
1 represented the smallest circle, and the number 
15 represented the largest (in Parts 1 and 2; 
number 8 was the largest in the other two parts 
of the experiment). 

A trial consisted of the presentation of a 


single stimulus category for a duration of 7 sec.; 
S was required to make and report his absolute 
judgment of the circle size within the first 5 sec. 
of this period. The code number appeared 
simultaneously with the stimulus during the 
remaining 2 sec. when immediate knowledge 
of results was given. 

Trials were grouped into experimental sessions 
that lasted approximately an hour; each S 
served for only one session on any one day. 
There were 150 trials per session when 15
stimulus categories were used, and 160 trials 
per session when eight stimulus categories were 
used. A 5-sec. interval elapsed between suc- 
cessive trials within a given session, and a rest 
period of 5-min. duration was given during 
each session after both the first and second 
thirds of the trials had been completed. 

The series of stimulus categories for a given 
part of the experiment was exposed twice before 
each session—once in ascending order and once 
in descending order. During the test trials, 
however, the order of presentation of the stimuli 
was random within the restriction that each 
stimulus appeared equally often during each 
session. 
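
A presentation order that is random but shows each stimulus equally often in a session can be sketched as below. The function name and the use of a shuffled list are illustrative assumptions; the source does not say how the random orders were actually generated.

```python
import random


def balanced_order(n_categories, trials_per_session, rng=None):
    """Random order in which every stimulus category appears equally often."""
    rng = rng or random.Random()
    if trials_per_session % n_categories:
        raise ValueError("trials must be a multiple of the number of categories")
    repeats = trials_per_session // n_categories
    order = [c for c in range(1, n_categories + 1) for _ in range(repeats)]
    rng.shuffle(order)
    return order


# 15 categories in 150 trials: each stimulus appears 10 times.
session = balanced_order(15, 150)
# 8 categories in 160 trials: each stimulus appears 20 times.
short_session = balanced_order(8, 160)
```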

Five days of preliminary practice (one session 
per S per day) were given before the experiment 
proper to familiarize S with the stimuli and the 
procedures. These training trials were made 
under the conditions of Part 1, i.e., 15 stimulus 
categories were used and immediate knowledge 
of results was given. After the practice period, 
S served for the four parts of the experiment in 
the following order: five sessions (a total of 750 
trials per S) for Part 1, five sessions (750 trials 
per S) for Part 2, one session (160 trials per S) 
for Part 3, and one session (160 trials per S) for 
Part 4. 

Between Parts 3 and 4 all Ss served for one 
session (150 trials per S) under the conditions 
of Part 2. The data of this one day were used 
as a check on practice effects; since no significant 
improvement in terms of the number of correct 
identifications was found, it was inferred that 
practice per se did not affect the results of Parts 
2, 3, and 4. 


Results 


Information transmitted.—The data 


of each S were arranged into a 
stimulus-response matrix for each 
part of the experiment. From each 
of these matrices, the amount of 
information transmitted was com- 
puted using response information and 
equivocation (8, p. 449, Formula 9); 








88 EARL A. ALLUISI AND RAYMOND C. SIDORSKY 


TABLE 1

AMOUNT OF INFORMATION (BITS/STIMULUS) IN
ABSOLUTE JUDGMENTS OF CIRCULAR SIZE

              Part of Exp. I
Ss        1       2       3       4

1       2.83    2.60    2.72    1.46
2       2.78    2.73    2.87    2.36
3       2.63    2.33    2.27    1.67
4       2.57    2.19    2.55    1.51
5       2.47    1.99    2.13    1.70
Mean    2.66    2.37    2.51    1.74














an independent computational check 
was made by recomputing the in- 
formation transmitted using stimulus 
information and equivocation (8, p. 
450, Formula 12). The results of 
these computations are shown in 
Table 1 for the five Ss and the four 
parts of the experiment. 
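
The two computations can be sketched from a matrix of response counts. The sketch uses the standard identities (response information minus response equivocation, and stimulus information minus stimulus equivocation); that these correspond to Formulas 9 and 12 of (8) is assumed, and the function names are hypothetical.

```python
from math import log2


def entropy(counts):
    """Uncertainty in bits of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def transmitted_information(matrix):
    """matrix[i][j] = number of times stimulus i drew response j.

    Returns the transmitted information computed two ways; the two
    values should agree apart from rounding error.
    """
    rows = [list(r) for r in matrix]
    cols = [list(c) for c in zip(*matrix)]
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in cols]
    n = sum(row_totals)
    # Equivocation: weighted mean uncertainty within each row or column.
    resp_equiv = sum(t / n * entropy(r) for t, r in zip(row_totals, rows) if t)
    stim_equiv = sum(t / n * entropy(c) for t, c in zip(col_totals, cols) if t)
    via_responses = entropy(col_totals) - resp_equiv
    via_stimuli = entropy(row_totals) - stim_equiv
    return via_responses, via_stimuli
```

Perfect identification of two equally frequent stimuli gives 1 bit by either route, and responses unrelated to the stimuli give 0 bits.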

The statistical significance of the 
differences between the amounts of 
information transmitted by the aver- 
age S in the four parts of the experi- 
ment (i.e., between the means of 
Table 1) was tested by use of t tests.
The difference between the means 
of Parts 1 and 2 was statistically 
significant (P < .02). The differ- 
ences between the mean of Part 4 
and the means of the other three parts 
were all statistically significant (P < 
.01 in each case). The differences between
the mean of Part 3 and the
means of Parts 1 and 2 were not sta- 
tistically significant (P > .05 in each 
case). 
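
The source does not say which form of t test was used. The sketch below assumes a paired test, since the same five Ss served in every part, and applies it to the values of Table 1. The function name is hypothetical.

```python
from math import sqrt

# Bits/stimulus for the five Ss in each part, copied from Table 1.
TABLE_1 = {
    1: [2.83, 2.78, 2.63, 2.57, 2.47],
    2: [2.60, 2.73, 2.33, 2.19, 1.99],
    3: [2.72, 2.87, 2.27, 2.55, 2.13],
    4: [1.46, 2.36, 1.67, 1.51, 1.70],
}


def paired_t(a, b):
    """Return (t, degrees of freedom) for paired observations a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / sqrt(var / n), n - 1


t_12, df = paired_t(TABLE_1[1], TABLE_1[2])
print(round(t_12, 2), df)
```

Under this assumption t for Parts 1 and 2 comes out close to 4 on 4 df, which is in line with the reported P < .02, and the values for Part 3 against Parts 1 and 2 fall short of the .05 level.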

Apparently, the amount of in- 
formation transmitted by the average 
S under a given set of experimental 
conditions will be greater the greater 
the range of stimulation used, and 
will be greater when knowledge of 
results is provided. 

ED scaling—The data of all five 
Ss were pooled into a general stimulus- 
response matrix for each part of the 


experiment, and an ED scale was 
derived from each of the four matrices. 
The specific steps in the scaling pro- 
cedure have been described by Garner 
and Hake (8). Briefly summarized, 
the values for each of the four ED 
scales were computed as follows: 


(a) The data (absolute identifications of 
n stimulus categories of circular size from one 
of the parts of the experiment) were arranged in 
the form of a stimulus-response matrix. (b) The 
frequency of responses was cumulated by suc- 
cessive response categories for each stimulus 
separately, and (c) these were then converted 
into proportions. Next, (d) the cumulative 
proportions were converted into equivalent 
standard scores, and (e) the average difference
in standard scores was computed for each 
successive pair of response categories. 

At this point, (f) the lowest response category 
was arbitrarily assigned a value of zero, and 
the successive differences (from step ¢) were 
added to obtain a value for each successive 
response category; this should have had the 
effect of spacing the response categories for 
equal discriminability. 

Finally, (g) the stimulus-response categories 
were plotted in the form of standard-score 
cumulative frequencies as a function of the 
adjusted response-category values (derived in 
step f), and (h) the ED-scale value for each 
stimulus category was determined from this 
plot by reading, from a physically equal-interval 
abscissa (R scale), the value corresponding to a 
standard score of zero on the ordinate for the 
specific stimulus category. In other words, 
the ED-scale value was equivalent to the median 
response for a specific stimulus category meas- 
ured in terms of the R scale. 
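
Steps (a) through (h) can be sketched numerically. This is a simplified illustration, not the procedure of (8): the graphical reading of step (h) is replaced by linear interpolation between neighbouring points, cumulative proportions of 0 or 1 are dropped because they have no finite standard score, and all function names are hypothetical.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf


def cumulative_z(matrix):
    """Steps (b)-(d): z score of the cumulative proportion at each boundary."""
    table = []
    for row in matrix:
        total, running, zs = sum(row), 0, []
        for count in row[:-1]:  # the last cumulative proportion is always 1
            running += count
            p = running / total
            zs.append(_z(p) if 0 < p < 1 else None)
        table.append(zs)
    return table


def boundary_values(z_table):
    """Steps (e)-(f): cumulate mean z differences, lowest boundary set to 0."""
    values = [0.0]
    for j in range(len(z_table[0]) - 1):
        diffs = [zs[j + 1] - zs[j] for zs in z_table
                 if zs[j] is not None and zs[j + 1] is not None]
        if not diffs:
            raise ValueError("no stimulus spans boundaries %d and %d" % (j, j + 1))
        values.append(values[-1] + sum(diffs) / len(diffs))
    return values


def ed_scale(matrix):
    """Steps (g)-(h): value at which each stimulus's z curve crosses zero."""
    z_table = cumulative_z(matrix)
    values = boundary_values(z_table)
    scale = []
    for zs in z_table:
        pts = [(v, z) for v, z in zip(values, zs) if z is not None]
        crossing = None
        for (v0, z0), (v1, z1) in zip(pts, pts[1:]):
            if z0 <= 0 <= z1 and z1 > z0:
                crossing = v0 + (v1 - v0) * (0 - z0) / (z1 - z0)
                break
        scale.append(crossing)  # None: spread of responses too small to scale
    return scale


# Hypothetical counts: 4 stimuli (rows) by 4 response categories (columns).
example = [[14, 5, 1, 0], [4, 12, 4, 0], [0, 4, 12, 4], [0, 1, 5, 14]]
print(ed_scale(example))
```

In this made-up example the two end stimuli return None because their response distributions never straddle the median within the available boundaries. A fitted line could extrapolate where this sketch does not.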

It should be noted that the resulting ED-scale 
values could be obtained either in arbitrary 
units or in units of cumulative standard scores. 
To obtain the values in standard-score units, 
the same R scale is used both in plotting the 
response categories according to their adjusted 
values (in step g) and in reading off the ED-scale 
values (in step h). When the R scale used in
reading off the ED-scale values is different from 
the scale used in plotting the response categories, 
the resulting ED-scale values are linear trans- 
formations of the standard-score ED-scale 
values, and the units themselves may be said 
to be arbitrary. As originally outlined by 
Garner and Hake (8, p. 457), the usual procedure 
would seem to be that of obtaining the values 
in standard-score units. 

The range of standard-score units over 
which the ED-scale values varied, however, 







was different for the four parts of this experi- 
ment: 23.3, 15.8, 19.5, and 8.4 for Parts 1 
through 4, respectively. These ranges are 
positively correlated with the mean amounts 
of information transmitted as reported in Table 
1. Perhaps they could be used as substitutes 
for the information analysis in estimating the 
absolute level of performance in each of the four 
parts of the experiment as has been suggested 
elsewhere by Cohen (4). 


In order to compare the ED scales 
of the four parts, it was necessary 
to equate the data with respect to 
these differences in the range of 
standard-score units. This was ac- 
complished by linearly transforming 
the standard-score ED-scale values 
to fit arbitrary units between the 
values of 3 and 17. The means 
and ranges of the ED-scale values 
(in these arbitrary units) for the 
four parts of the experiment are 
shown as data points in Fig. 1; the 
curve drawn to these points is, there- 
fore, a composite ED function for the 
absolute judgments of circular size. 
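
A linear transformation of this kind can be sketched as follows. The sketch maps the lowest scale value to 3 and the highest to 17, which is appropriate only when a scale spans the whole stimulus series; how the narrower series of Part 4 was fitted is not described in the source, so that case is not covered. The function name is hypothetical.

```python
def rescale(values, low=3.0, high=17.0):
    """Linearly map values so that min(values) -> low and max(values) -> high."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values have no spread to rescale")
    slope = (high - low) / (hi - lo)
    return [low + (v - lo) * slope for v in values]


# A scale running from 0 to 23.3 standard-score units (the range reported
# for Part 1); the intermediate value is made up for illustration.
print(rescale([0.0, 11.65, 23.3]))
```

Because the transformation is linear, equal differences in standard-score units remain equal differences in the arbitrary units.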

The performance obtained with 
each stimulus category was apparently 
the same in the four parts of the 
experiment, and a single ED function 
appears to represent all the data. 
Because the numerical values of 3 
and 17 mark the end points of both 
the ED scale and the stimulus series, 
the departure of the ED function 
from a straight line at 45° registers 
the departure of perceptual from 
physical equal intervals. Where the 
slope of the function is relatively 
steep (near the beginning and at the 
end points), discrimination is better 
than where it is not so steep (near the 
center categories of the stimulus 
series). 

One question might be raised con- 
cerning the procedure used in con- 
structing the ED scales presented in 
Fig. 1. Each of the four scales was 


constructed from a stimulus-response 
matrix containing the pooled data of 


all five Ss. Would the same ED 
function have been obtained had the 
Ss been treated individually, i.e., 
had an ED scale been constructed 
for each S in each part of the experi- 
ment? 

To obtain an answer to this ques- 
tion, an ED scale for each of the five 
Ss was constructed from the data of 
Part 2. It was thought that these 
data would yield the most reliable 
scales because (a) they were based 
on more judgments than the data of 
Parts 3 and 4, and (b) they indicated
that more errors were made in the 
judgments of Part 2 than in those 
of Part 1. This second is important 
in ED scaling because the scale is 
derived from the normalized spread of 
responses. 

The means and ranges of the ED- 
scale values (in arbitrary units) for 
the five Ss in Part 2 of the experi- 
ment are shown as data points in 
Fig. 2; the curve drawn to these 
points is the same as that drawn 
to the points of Fig. 1. The abscissa 
and the ordinate of Fig. 2 are, of 
course, the same as those of Fig. 1. 

One stimulus point (diameter of 
3/64 in.) could not be plotted in Fig. 2 





[Figure 1: plot. Ordinates: ED-Scale Value (Arbitrary Units); ED-Scale Value (Standard-Score Units for Part 1).]

Fig. 1. Mean and range of ED-scale values
for the four parts of Exp. I as a function of
stimulus diameter.






















[Figure 2: plot. Abscissa: Stimulus Diameter (1/64ths inch). Legend: mean ED-scale value; mean and range.]

Fig. 2. Mean and range of ED-scale values
for the five Ss in Part 2 of Exp. I as a function
of stimulus diameter. The ED-scale function
(solid line) is the same as that in Fig. 1. The
data obtained were insufficient for the scaling
of the first stimulus point (stimulus diameter
of 3/64 in.).


because the spread of responses made 
by Ss individually was too small for 
scaling. Also, because the scaling of 
the other end point (diameter of 
17/64 in.) could be accomplished 
for only three of the five Ss, the scale 
was arbitrarily “anchored” at the 
next lower stimulus category (diam- 
eter of 16/64 in.). 

In general, except for a little 
greater variability among the points 
for individual Ss, these data appear 
quite similar to those of Fig. 1. 
Apparently, the ED function ob- 
tained with the unpooled data (Fig. 
2) represents essentially as well the 
ED function obtained with the pooled 
data (Fig. 1). 

Response equivocation—The dif- 
ferences in discriminability among the 
stimulus categories are also _illus- 
trated in Fig. 3 where response equiv- 
ocation (i.e., the uncertainty as to 
which response will be given knowing 
what stimulus has been presented; 
see 8, p. 449) is shown for the stimulus 
categories in each of the four parts 


of the experiment. The bowing of the
curves indicates that performance near 
the beginning and at the end points 
of the stimulus series was better than 
performance around the center cate- 
gories of the series. 
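
Response equivocation for a single stimulus category is the uncertainty, in bits, of the responses given to that stimulus. A minimal sketch follows; the counts are hypothetical and the function name is an assumption.

```python
from math import log2


def response_equivocation(row):
    """Uncertainty in bits of the responses made to one stimulus category."""
    total = sum(row)
    return -sum(c / total * log2(c / total) for c in row if c > 0)


# Hypothetical response counts for an end stimulus and a center stimulus.
end_stimulus = [18, 2, 0, 0, 0]
center_stimulus = [1, 5, 8, 5, 1]
print(response_equivocation(end_stimulus) < response_equivocation(center_stimulus))  # True
```

Responses concentrated in one or two categories give a low value, and responses spread over many categories give a high one, which is the pattern the bowed curves describe.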


EXPERIMENT II


Equal extents of discriminability 
are supposedly represented by equal 
distances along the ordinate of Fig. 1. 
That is to say, if stimulus values 
are selected from the abscissa so as 
to be spaced at equal intervals along 
the ordinate (determined from the 
points at which they intersect the ED 
function), then those stimuli should 
represent equal probabilities of being 
correctly, or erroneously, identified. 
Experiment II was designed (a) to 
determine whether stimuli so selected 
from the ED function of Exp. I would 
indeed be “equally discriminable,” 
and (b) to measure the amount of 
information transmitted after practice 
with various numbers of equally 
discriminable stimulus categories. 

In this experiment, 18 Ss made 
absolute judgments of the sizes of 
five small circles of light that had 
been selected from the ED scale of 
Exp. I. The Ss were then randomly





[Figure 3: plot. Ordinate: Response Equivocation (Bits/Stimulus). Abscissa: Stimulus Diameter (1/64ths inch).]

Fig. 3. Response equivocation as a function of
stimulus diameter for the four parts of Exp. I.





VALIDITY OF EQUAL DISCRIMINABILITY SCALING 


divided into three groups, one of 
which continued to make judgments 
of the five circles, while the other two 
groups judged either seven or nine 
circles. Knowledge of results was 
provided S after each response. 


Method 


Stimuli.—Three sets of stimulus diameters 
were selected from the ED function of Exp. I 
so that the stimulus categories within each of 
the sets were spaced equally according to their 
ED-scale values. One set consisted of five 
categories of stimulation (in arbitrary units, 
at ED-scale values of 3.0, 6.5, 10.0, 13.5, and 
17.0), whereas the second and third sets con-
sisted of seven and nine categories, respectively.
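
The selection rule can be sketched in two steps: choose equally spaced ED-scale values, then read the corresponding diameters off the ED function by interpolation. The five-category values below reproduce those given in the text. That the seven- and nine-category sets span the same 3.0 to 17.0 range is an assumption made for illustration. The tabulated ED function (`ed_scale`, `diameters`) is hypothetical and merely stands in for the function of Fig. 1.

```python
import numpy as np

def equal_ed_values(low, high, k):
    """k ED-scale values spaced at equal intervals from low to high."""
    return np.linspace(low, high, k)

def diameters_for(ed_targets, ed_scale, diameters):
    """Read stimulus diameters off a tabulated, monotonically increasing ED function.

    ed_scale and diameters are paired points on the function; values
    between tabulated points are linearly interpolated.
    """
    return np.interp(ed_targets, ed_scale, diameters)

five = equal_ed_values(3.0, 17.0, 5)    # 3.0, 6.5, 10.0, 13.5, 17.0 as in the text
seven = equal_ed_values(3.0, 17.0, 7)   # ASSUMED range for the seven-category set
nine = equal_ed_values(3.0, 17.0, 9)    # ASSUMED range for the nine-category set

# Hypothetical ED function (not the values of Fig. 1).
ed_scale = np.array([3.0, 7.0, 11.0, 14.5, 17.0])
diameters = np.array([2.0, 4.0, 8.0, 16.0, 32.0])
d = diameters_for(five, ed_scale, diameters)
```

Linear interpolation is only one way to read a fitted curve; a smoothed ED function would call for the inverse of whatever curve was fitted.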

Apparatus.—The experimental booth and
display area were as in Exp. I. A standard
slide projector equipped with a semi-automatic
slide-changing attachment containing slides
mounted in 2-in. square metal binders was
used to present the stimuli. The stimulus
slides were 40-to-1 negative photographic
reductions of circles originally drawn at 16 times
their diametric sizes. The stimuli, as projected,
appeared at the approximate center of the dis-
play surface.

Subjects.—Eighteen men, ranging in age
between 19 and 21 yr., served as Ss in this ex-
periment; none had served in Exp. I. All
possessed normal or corrected 20/20 visual
acuity. They were employed in the laboratory
as paid Ss and were experienced in making
psychophysical judgments of the sort required
in this study.

Procedure.—All Ss were instructed to make
absolute judgments of the sizes of the small
circles of light that served as stimuli. Presenta-
tion of the stimuli followed the classical pro-
cedure for the method of single stimuli. A
trial consisted of the presentation of a single
stimulus, S’s verbal report of his judgment as
to its category of size, and E’s verbal response
to provide immediate knowledge of results.
When S’s response was incorrect, E verbally
reported the correct identification of the stimu-
lus; otherwise, E merely said “right.”

Sessions consisted of either 100, 98, or 99 
trials according to whether 5, 7, or 9 stimulus 
categories were presented. Different random 
orders of stimulus presentation were used for 
each session within the restriction that each 
of the 5, 7, or 9 stimulus categories be presented 
an equal number of times (20, 14, or 11, respec- 
tively). Each S served for only one session 
on any one day. 
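
The restriction on the random orders can be sketched as a shuffle of a list in which every category already appears the required number of times. The text does not say how the orders were actually drawn, so the shuffle below is an assumption; `session_order` and `SESSION_PLAN` are hypothetical names.

```python
import random

# Number of presentations per category, keyed by number of categories (from the text).
SESSION_PLAN = {5: 20, 7: 14, 9: 11}

def session_order(n_categories, repetitions, rng=None):
    """One random presentation order with each category shown equally often."""
    rng = rng or random.Random()
    order = [c for c in range(1, n_categories + 1) for _ in range(repetitions)]
    rng.shuffle(order)   # in-place shuffle keeps the per-category counts intact
    return order

# A seeded generator is used here only to make the example repeatable.
orders = {k: session_order(k, reps, random.Random(0))
          for k, reps in SESSION_PLAN.items()}
session_lengths = {k: len(v) for k, v in orders.items()}
```

The resulting session lengths are 100, 98, and 99 trials, matching the values reported for the 5-, 7-, and 9-category sets.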


[Figure 4. Abscissa: Sessions. Curves: Group 9 (N = 6), Group 7 (N = 6), Group 5 (N = 6). Reference lines: maximum possible performance with 9 categories and with 7 categories.]

Fig. 4. Information transmitted by the
average S in each of three experimental groups
as a function of practice. During Sessions 
1-5, all groups responded to a five-category set 
of stimuli; during Sessions 6-15, Group 5 
continued to respond to the five-category set 
of stimuli, while Groups 7 and 9 responded to 
seven- and nine-category sets, respectively. 


The pool of Ss was randomly divided into 
three groups, each of which contained six Ss. 
All three groups were trained for five sessions 
to make absolute judgments of the five-category 
set of stimuli. Then, for 10 additional sessions, 
while Group 5 continued to judge the five- 
category set, Groups 7 and 9 commenced re- 
sponding to the seven- and the nine-category 
sets of stimuli, respectively. 

During the training sessions, the five different
circular sizes were identified from smallest to
largest by the first five letters of the alphabet,
A to E. During the remaining 10 sessions,
the stimuli were identified by numbers: 1 for
the smallest, 2 for the next smallest, etc., to
5, 7, or 9 for the largest stimulus according to
the group. During all sessions, E recorded
all identifications with respect to both the
stimulus presented and the response made.


Results 


Information transmitted.—The data 
of each S were arranged into a 
stimulus-response matrix for each 
session, and from each matrix the 
amount of information transmitted 
was computed as in Exp. I. Then, 
the amount of information trans- 
mitted by the average S in each 
group was estimated by taking the 
arithmetic mean of the information 
scores for each session. These data 
















are summarized in Fig. 4 for each of 
the three groups separately. 
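
The computation "as in Exp. I" is not restated in this section. A minimal sketch, assuming the standard estimate T = H(S) + H(R) - H(S, R) taken from the relative frequencies in a stimulus-response matrix, is given below; the function names and the two test matrices are hypothetical.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy, in bits, of a probability array (zero cells ignored)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def information_transmitted(counts):
    """T = H(S) + H(R) - H(S, R) for a stimulus-by-response count matrix."""
    counts = np.asarray(counts, dtype=float)
    joint = counts / counts.sum()
    h_s = entropy_bits(joint.sum(axis=1))   # stimulus uncertainty
    h_r = entropy_bits(joint.sum(axis=0))   # response uncertainty
    return h_s + h_r - entropy_bits(joint)

# Errorless identification of five equally frequent categories (hypothetical).
perfect = np.eye(5) * 20
t_max = information_transmitted(perfect)    # equals log2(5), about 2.32 bits

# Responses unrelated to the stimuli (hypothetical) transmit nothing.
t_chance = information_transmitted(np.ones((5, 5)))

# Group score for a session: arithmetic mean over the Ss' individual scores.
group_mean = float(np.mean([t_max, t_chance]))
```

The errorless case gives log2(5), which is the ceiling of about 2.32 bits/stimulus for a five-category set.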

A chi-square components analysis 
(1, 17) was computed for the data of 
each of the last 10 sessions; this 
analysis amounts to a nonparametric 
simple analysis of variance for each 
session. The results of these analyses 
indicated that the differences in 
performance among the three groups 
were not statistically significant (P 
> .05) during Session 6, but that 
the differences were statistically sig- 
nificant (P < .01) from Session 7 
through Session 15. 

Furthermore, t tests computed from
the data of Sessions 11 through 15
indicated that the mean performances
of the three groups differed signifi-
cantly (P < .01 in the case of each
pair) over those sessions. The groups
appear to be leveling at about 2.32
bits/stimulus (5 categories), 2.74 bits/
stimulus (6.58 categories), and 2.94
bits/stimulus (7.67 categories), re-
spectively, for Groups 5, 7, and 9.

ED scaling.—The data of all six 
Ss in Group 9 were pooled into a 
single stimulus-response matrix for 
Sessions 6 through 15, and an ED 
scale was derived from this matrix. 
The ED-scale values varied over a 
range of 24.7 standard-score units;
this range was close to that found, 





[Figure 5. Ordinate: Response Equivocation (Bits/Stimulus). Abscissa: Stimulus Categories (Arbitrary ED-Scale Units). Curves: Group 9, Sessions 6-15; Group 7, Sessions 6-15; Group 5, Sessions 6-15; All Groups, Sessions 1-5.]

Fig. 5. Response equivocation for three
sets of stimuli selected at equal ED-scale 
intervals as a function of their previously 
determined ED-scale values. 


EARL A. ALLUISI AND RAYMOND C. SIDORSKY 


under the similar conditions of Part 
1 in Exp. I (23.3 standard-score
units). 

The ED-scale values obtained were 
then linearly transformed into arbi- 
trary units (as in Fig. 1, a range of 
values between 3 and 17 was used), 
and a Pearson product-moment coef- 
ficient of correlation was computed to 
compare the arbitrary ED-scale values 
derived in the two experiments for 
the nine stimulus categories used by 
Group 9. The resultant coefficient 
of correlation of .995 lends strong 
support to the hypothesis that ED- 
scale values are repeatable with 
different groups of Ss, different num- 
bers of stimulus categories, and dif- 
ferent spacings between stimulus 
categories. 
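
The two steps described above, a linear transformation onto the arbitrary 3-to-17 range followed by a Pearson product-moment correlation, can be sketched as follows. Only the two ranges (23.3 and 24.7 standard-score units) come from the text; the intermediate scale values are invented for illustration, and `rescale` is a hypothetical helper.

```python
import numpy as np

def rescale(values, low=3.0, high=17.0):
    """Linearly map scale values so the smallest becomes low and the largest high."""
    v = np.asarray(values, dtype=float)
    return low + (v - v.min()) * (high - low) / (v.max() - v.min())

# Hypothetical standard-score ED values for nine categories; only the
# overall ranges (23.3 and 24.7) are taken from the text.
exp1 = np.array([0.0, 2.9, 6.1, 8.7, 11.5, 14.8, 17.6, 20.4, 23.3])
exp2 = np.array([0.0, 3.2, 6.0, 9.4, 12.1, 15.5, 18.9, 21.8, 24.7])

a1, a2 = rescale(exp1), rescale(exp2)
r = np.corrcoef(a1, a2)[0, 1]           # Pearson product-moment coefficient
r_raw = np.corrcoef(exp1, exp2)[0, 1]   # same value: r is unchanged by linear rescaling
```

Because the correlation coefficient is invariant under linear transformation, the rescaling affects only the units in which the two scales are compared, not the size of r.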

Response equivocation.—If stimuli 
are selected at equal distances from 
an ED scale, and if equal distances 
on an ED scale do indeed represent 
equal extents of discriminability, then 
it follows that such stimuli should 
also represent equal tendencies to be 
erroneously, or correctly, identified. 
In other words, stimuli selected from
an ED scale to represent equal
extents of discriminability should,
if they are equally discriminable
stimuli, yield flat curves of response
equivocation.³

A test of the validity of the ED
scale derived in Exp. I (Fig. 1) is
possible, therefore, because the three
sets of stimuli used in the present
experiment were selected from the
ED function so that the stimulus
categories within each of the sets

3 From the theoretical point of view, informa- 
tion measures include measures of “equivoca- 
tion,” not errors. It is theoretically possible, 
but not necessary, for percentage errors and 
equivocation to be highly correlated. In 
practice, such a high correlation is generally 
found in situations where men make absolute 
judgments of stimuli (e.g., see 3). 










were spaced equally according to their 
ED-scale values. 

In Fig. 5, the response equivocation 
for the stimulus categories used in 
this experiment is plotted on the 
same ordinate that was used in Fig. 3 
for the data of Exp. I. In the present 
case, however, the curves appear to 
be relatively flat. This indicates 
that performance with any one stimu- 
lus category within one of the three 
sets of stimuli was equivalent to 
performance with the other stimuli 
in the same set. This finding sup- 
ports the hypothesis that stimuli 
will represent equal extents of dis- 
criminability if they are selected at 
equal intervals from an ED scale. 


Discussion 


Several studies (e.g., 5, 6, 12, 13, 15) 
reviewed elsewhere (2) indicate that 
certain conditions of experimentation 
affect the amount of information trans- 
mitted in making absolute judgments. 
The results of Exp. I corroborate this 
in showing that the knowledge-of-results 
conditions and the total range of stimula- 
tion used experimentally may affect the 
amount of information transmitted. 

Furthermore, the results of Exp. II 
corroborate the findings of Garner (6, 7) 
and the prediction of Garner and Hake 
(8) that “. . . greater information trans-
mission will occur when the stimuli are
spaced according to a criterion of equal 
discriminability” (6, p. 238). The great- 
est amount of information transmitted 
in Exp. I was 2.66 bits/stimulus in 
Part 1 (largest range of stimulation with 
knowledge of results provided S after 
each response). Three different levels 
of performance were reached with the 
three sets of equally discriminable stimuli 
in Exp. II. 

Group 5 was performing very near 
the maximum possible for five categories 
(2.32 bits/stimulus), but the final per- 
formance of Group 9 (2.94 bits/stimulus) 
exceeded that for Group 7 (2.74 bits/ 
stimulus), and they both exceeded the 





2.66 bits/stimulus obtained after ap- 
proximately equivalent practice with 
the non-ED stimulus spacings of Part 1 
in Exp. I. 

These findings imply that one must 
interpret the results of any specific 
information analysis as giving only an 
approximate estimate of the minimum 
number of stimulus categories that can 
be used to transmit the maximum 
amount of information. The estimate 
should be modified as the conditions of 
experimentation vary (e.g., range, spac- 
ing, knowledge of results, number of 
response categories; see 2). 

The technique of ED scaling appears 
to have been validated by the data 
of these two experiments to the extent 
that essentially identical ED-scale func- 
tions were obtained (a) under the dif- 
ferent sets of experimental conditions 
of Exp. I as was shown in Fig. 1, (b)
with different individual Ss under the 
same set of conditions in Part 2 of Exp. I 
as was shown in Fig. 2, (c) with different 
groups of Ss under similar experimental 
conditions as indicated by the high 
correlation between the ED-scale values 
of Exp. II and those of Part 1 in Exp. I, 
and to the extent that (d) the curves 
of response equivocation (Fig. 5) were 
flat for sets of stimuli selected at equal 
intervals from the ED scale derived in 
Exp. I. These findings corroborate and 
extend those of Garner (6, 7) for absolute 
judgments of loudness. 

The first three of these four results
indicate that the relative levels of per-
formance among various stimulus cate-
gories in an absolute-judgment task are
independent of (at least certain) differing
conditions of experimentation. The ED-
scale values, in arbitrary units, were
independent of (a) the conditions of
knowledge of results, (b) the total range
of stimulation, (c) the spacing used
between adjacent stimulus categories,
(d) the number of stimulus and response
categories within the limits of those
employed in these experiments, (e)
the differences among individual Ss, and
(f) the differences between groups of Ss.

To the extent that the absolute level 













of performance (the amount of informa- 
tion transmitted by the average S) was 
not independent of these conditions, the 
relative levels may be said to be independent
of the absolute levels of performance 
attained under the different conditions 
of experimentation. These generaliza- 
tions hold only for ED-scale values 
reported in the same arbitrary units, 
not in standard-score units that reflect 
differences in the absolute levels of 
performance. 

The final one of these four results is, 
perhaps, the clearest empirical validation 
of ED scaling. It indicates that there 
should be an affirmative answer to the 
question: “If stimulus values are selected
from the abscissa so as to be spaced at
equal intervals along the ordinate of
the ED scale, will the stimuli be equally
discriminable?” With response equiv-
ocation as a performance measure of
discriminability, the stimuli so selected
from the ED scale of Exp. I were, indeed,
equally discriminable according to the
data of Exp. II (Fig. 5).


SUMMARY 


Two experiments were conducted in which 
Ss made absolute judgments of the sizes of 
small circles of light. Five Ss made judgments 
in Exp. I under different conditions of (a) 
knowledge of results, (b) range of stimulation, 
(c) spacing between adjacent stimulus cate- 
gories, and (d) number of stimulus and response 
categories. In Exp. II, 18 Ss made absolute 
judgments of the sizes of five small circles of 
light that had been selected from an ED scale 
constructed in Exp. I. The Ss were then 
randomly divided into three groups, one of 
which continued to make judgments of the 
five circles, while the other two groups judged 
either seven or nine circles selected from the 
ED scale. Knowledge of results was provided
S after each response in Exp. II. The results
were as follows:

1. The amount of information transmitted
by the average S, measured in bits/stimulus,
was increased (a) when the range of stimulation
used experimentally was increased, (b) when
knowledge of results was provided, and (c) when
the stimuli used were spaced according to a
criterion of equal discriminability.

2. Essentially identical ED-scale functions
were obtained under the different sets of experi-
mental conditions. Apparently, ED-scale val-
ues for a given stimulus dimension, in arbitrary
units, are independent of (a) the conditions of
knowledge of results, (b) the total range of
stimulation, (c) the spacing used between
adjacent stimulus categories, (d) the number of
stimulus and response categories, (e) the dif-
ferences among individual Ss, and (f) the dif-
ferences between groups of Ss.

It appears, in conclusion, that ED scales can
be used validly for selecting equally discrimin-
able stimuli independently of the number of
stimuli to be selected, and independently of the
final absolute levels of performance attainable
with those stimuli.


REFERENCES 


1. Alluisi, E. A. Computational formulae
for a distribution-free test of analysis-of-
variance hypotheses. USAF WADC
Tech. Rep., 1956, No. 56-339.

2. Alluisi, E. A. Conditions affecting the
amount of information in absolute judg-
ments. Psychol. Rev., 1957, 64, 97-103.

3. Chapanis, A., & Halsey, R. M. Absolute
judgments of spectrum colors. J. Psy-
chol., 1956, 42, 99-103.

4. Cohen, J. Binocular disparity as a coding
dimension for pictorial instrument and
radar displays. USAF WADC Tech.
Rep., 1955, No. 55-393.

5. Eriksen, C. W., & Hake, H. W. Absolute
judgments as a function of stimulus
range and number of stimulus and
response categories. J. exp. Psychol.,
1955, 49, 323-332.

6. Garner, W. R. An equal discriminability
scale for loudness judgments. J. exp.
Psychol., 1952, 43, 232-238.

7. Garner, W. R. An informational analysis
of loudness. J. exp. Psychol., 1953, 46,
373-380.

8. Garner, W. R., & Hake, H. W. The
amount of information in absolute judg-
ments. Psychol. Rev., 1951, 58, 446-459.

9. Miller, G. A. What is information
measurement? Amer. Psychologist, 1953,
8, 3-12.

10. Miller, G. A. The magical number seven,
plus or minus two: some limits on our
capacity for processing information.
Psychol. Rev., 1956, 63, 81-97.

11. Muller, P. F., Jr., Sidorsky, R. C.,
Slivinske, A. J., Alluisi, E. A., &
Fitts, P. M. The symbolic coding of
information on cathode ray tubes and
similar displays. USAF WADC Tech.
Rep., 1955, No. 55-375.

12. Pollack, I. The information of elementary
auditory displays. J. acoust. Soc. Amer.,
1952, 24, 745-749.

13. Pollack, I. The information of elementary
auditory displays. II. J. acoust. Soc.
Amer., 1953, 25, 765-769.

14. Quastler, H. (Ed.) Information theory in
psychology. Glencoe, Ill.: Free Press,
1955.

15. Schipper, L. An analysis of information
transmitted to human observers with
auditory signals as a function of number
of stimuli and stimulus intensity interval
size. Unpublished doctor’s dissertation,
Univer. Wisconsin, 1953.

16. Shannon, C., & Weaver, W. The mathe-
matical theory of communication. Ur-
bana: Univer. Illinois Press, 1949.

17. Wilson, K. V. A distribution-free test
of analysis of variance hypotheses.
Psychol. Bull., 1956, 53, 96-101.


(Received January 28, 1957)


Journal of Experimental Psychology
Vol. 55, No. 1, 1958


SUPPLEMENTARY REPORT: INTERLIST INTERFERENCE AND THE
RETENTION OF PAIRED CONSONANT SYLLABLES


BENTON J. UNDERWOOD
Northwestern University ¹
AND
JACK RICHARDSON
Harpur College


Previous studies found that distributed 
practice facilitated 24-hr. retention of serial 
consonant lists when interlist interference was 
high (2), but had no effect on retention of paired 
consonant lists under comparable conditions of 
interference (3). It was suggested that a 
lower degree of learning, a shorter retention 
interval, or both, might result in better retention 
of these paired consonant lists following learning 

by distributed practice than following learning
by massed practice.

¹ R. W. Schultz supervised the gathering of
the data. This work was done under N7onr-45008,
Project NR 154-057, between Northwestern
University and The Office of Naval Research.

Method.—The lists, method, rate of presenta-
tion, and sequence of learning successive lists
were the same as for the four-list groups in the
previous experiment using paired consonant
syllables (3). Four groups of 32 Ss each learned
all four lists. Two of the groups received 10
trials of learning on each list while the other
two groups received 25 trials on each list.
One group from each degree of learning learned
all lists with a 4-sec. intertrial interval while the
other two groups learned the first and last list
with a 60-sec. intertrial interval and the other
two lists with a 4-sec. interval. All Ss re-
ceived five trials of relearning on the first and
last lists 1 hr. after learning the lists.


TABLE 1

Mean Total Correct Responses on 10 and 25 Trials of Learning by Massed (M)
and Distributed (D) Practice, and Mean Loss Scores Over the
1-Hr. Retention Interval

          Total Responses                  Loss Scores
          First List     Last List         First List     Last List
Trials    M      D       M      D          M      D       M      D
10        10.75  12.31   10.50  13.25      .67    .67     1.18   1.05
25        40.03  44.31   34.66  51.20      1.02   1.23    1.29   1.63


Results.—An analysis of variance of the total
correct responses during learning showed that 
distribution of practice facilitated learning for 
all groups and lists but the differences were 
significant only with the higher degree of learning 
when the learning had been preceded by three 
other lists, i.e., there was an interaction between 
intertrial interval and degree of learning and 
between intertrial interval and number of lists 
learned previously. The mean values are shown 
in Table 1. 

Loss scores over the retention interval were
computed by subtracting the obtained recall
from the predicted score (1). There was no
indication (see Table 1) that intertrial interval
interacts with length of retention interval or
with degree of learning.








REFERENCES 


1. Richardson, J., & Underwood, B. J. Com-
paring retention of verbal lists after
different rates of acquisition. J. gen.
Psychol., 1957, 56, 187-192.

2. Underwood, B. J., & Richardson, J.
Studies of distributed practice: XIII.
Interlist interference and the retention
of serial nonsense lists. J. exp. Psychol.,
1955, 50, 39-46.

3. Underwood, B. J., & Richardson, J.
Studies of distributed practice: XVII.
Interference and the retention of paired
consonant syllables. J. exp. Psychol.,
1957, 54, 274-279.


(Received August 29, 1957)
































